{
  "cells": [
    {
      "attachments": {},
      "cell_type": "markdown",
      "id": "8f18ba38-4271-490e-a3f1-7c05bcba65e7",
      "metadata": {},
      "source": [
        "## Linking a dataset of real historical persons\n",
        "In this example, we deduplicate a more realistic dataset. The data is based on historical persons scraped from wikidata. Duplicate records are introduced with a variety of errors introduced."
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "id": "8b53cd2f-c007-4997-9ecd-c27930bbcc3a",
      "metadata": {},
      "source": [
        "Create a boto3 session to be used within the linker"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "id": "b77a3a3c-5c7f-4e18-8482-95c090b18b79",
      "metadata": {},
      "outputs": [],
      "source": [
        "import boto3\n",
        "\n",
        "boto3_session = boto3.Session(region_name=\"eu-west-1\")"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "id": "d1aa6941-e7f4-4ac3-a166-ae7357c79198",
      "metadata": {},
      "source": [
        "## AthenaLinker Setup\n",
        "\n",
        "To work nicely with Athena, you need to outline various filepaths, buckets and the database(s) you wish to interact with.\n",
        "<hr>\n",
        "\n",
        "**The AthenaLinker has three required inputs:**\n",
        "* input_table_or_tables - the input table to use for linking. This can either be a table in a database or a pandas dataframe\n",
        "* output_database - the database to output all of your splink tables to.\n",
        "* output_bucket - the s3 bucket you wish any parquet files produced by splink to be output to.\n",
        "\n",
        "**and two optional inputs:**\n",
        "* output_filepath - the s3 filepath to output files to. This is an extension of output_bucket and dictate the full filepath your files will be output to.\n",
        "* input_table_aliases - the name of your table within your database, should you choose to use a pandas df as an input."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "id": "023f3947-9cb9-44db-a7f9-f7dab41def34",
      "metadata": {},
      "outputs": [],
      "source": [
        "# Set the output bucket and the additional filepath to write outputs to\n",
        "############################################\n",
        "# EDIT THESE BEFORE ATTEMPTING TO RUN THIS #\n",
        "############################################\n",
        "\n",
        "from splink.backends.athena import AthenaAPI\n",
        "\n",
        "\n",
        "bucket = \"MYTESTBUCKET\"\n",
        "database = \"MYTESTDATABASE\"\n",
        "filepath = \"MYTESTFILEPATH\"  # file path inside of your bucket\n",
        "\n",
        "aws_filepath = f\"s3://{bucket}/{filepath}\"\n",
        "db_api = AthenaAPI(\n",
        "    boto3_session,\n",
        "    output_bucket=bucket,\n",
        "    output_database=database,\n",
        "    output_filepath=filepath,\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "id": "52c2e9a0",
      "metadata": {},
      "outputs": [],
      "source": [
        "import splink.comparison_library as cl\n",
        "from splink import block_on\n",
        "\n",
        "from splink import Linker, SettingsCreator, splink_datasets\n",
        "\n",
        "df = splink_datasets.historical_50k\n",
        "\n",
        "settings = SettingsCreator(\n",
        "    link_type=\"dedupe_only\",\n",
        "    blocking_rules_to_generate_predictions=[\n",
        "        block_on(\"first_name\", \"surname\"),\n",
        "        block_on(\"surname\", \"dob\"),\n",
        "    ],\n",
        "    comparisons=[\n",
        "        cl.ExactMatch(\"first_name\").configure(term_frequency_adjustments=True),\n",
        "        cl.LevenshteinAtThresholds(\"surname\", [1, 3]),\n",
        "        cl.LevenshteinAtThresholds(\"dob\", [1, 2]),\n",
        "        cl.LevenshteinAtThresholds(\"postcode_fake\", [1, 2]),\n",
        "        cl.ExactMatch(\"birth_place\").configure(term_frequency_adjustments=True),\n",
        "        cl.ExactMatch(\"occupation\").configure(term_frequency_adjustments=True),\n",
        "    ],\n",
        "    retain_intermediate_calculation_columns=True,\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "id": "ead9c4a2",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "\n",
              "<style>\n",
              "  #altair-viz-c11379e3862b4d19a24f604c51491a84.vega-embed {\n",
              "    width: 100%;\n",
              "    display: flex;\n",
              "  }\n",
              "\n",
              "  #altair-viz-c11379e3862b4d19a24f604c51491a84.vega-embed details,\n",
              "  #altair-viz-c11379e3862b4d19a24f604c51491a84.vega-embed details summary {\n",
              "    position: relative;\n",
              "  }\n",
              "</style>\n",
              "<div id=\"altair-viz-c11379e3862b4d19a24f604c51491a84\"></div>\n",
              "<script type=\"text/javascript\">\n",
              "  var VEGA_DEBUG = (typeof VEGA_DEBUG == \"undefined\") ? {} : VEGA_DEBUG;\n",
              "  (function(spec, embedOpt){\n",
              "    let outputDiv = document.currentScript.previousElementSibling;\n",
              "    if (outputDiv.id !== \"altair-viz-c11379e3862b4d19a24f604c51491a84\") {\n",
              "      outputDiv = document.getElementById(\"altair-viz-c11379e3862b4d19a24f604c51491a84\");\n",
              "    }\n",
              "    const paths = {\n",
              "      \"vega\": \"https://cdn.jsdelivr.net/npm/vega@5?noext\",\n",
              "      \"vega-lib\": \"https://cdn.jsdelivr.net/npm/vega-lib?noext\",\n",
              "      \"vega-lite\": \"https://cdn.jsdelivr.net/npm/vega-lite@5.17.0?noext\",\n",
              "      \"vega-embed\": \"https://cdn.jsdelivr.net/npm/vega-embed@6?noext\",\n",
              "    };\n",
              "\n",
              "    function maybeLoadScript(lib, version) {\n",
              "      var key = `${lib.replace(\"-\", \"\")}_version`;\n",
              "      return (VEGA_DEBUG[key] == version) ?\n",
              "        Promise.resolve(paths[lib]) :\n",
              "        new Promise(function(resolve, reject) {\n",
              "          var s = document.createElement('script');\n",
              "          document.getElementsByTagName(\"head\")[0].appendChild(s);\n",
              "          s.async = true;\n",
              "          s.onload = () => {\n",
              "            VEGA_DEBUG[key] = version;\n",
              "            return resolve(paths[lib]);\n",
              "          };\n",
              "          s.onerror = () => reject(`Error loading script: ${paths[lib]}`);\n",
              "          s.src = paths[lib];\n",
              "        });\n",
              "    }\n",
              "\n",
              "    function showError(err) {\n",
              "      outputDiv.innerHTML = `<div class=\"error\" style=\"color:red;\">${err}</div>`;\n",
              "      throw err;\n",
              "    }\n",
              "\n",
              "    function displayChart(vegaEmbed) {\n",
              "      vegaEmbed(outputDiv, spec, embedOpt)\n",
              "        .catch(err => showError(`Javascript Error: ${err.message}<br>This usually means there's a typo in your chart specification. See the javascript console for the full traceback.`));\n",
              "    }\n",
              "\n",
              "    if(typeof define === \"function\" && define.amd) {\n",
              "      requirejs.config({paths});\n",
              "      require([\"vega-embed\"], displayChart, err => showError(`Error loading script: ${err.message}`));\n",
              "    } else {\n",
              "      maybeLoadScript(\"vega\", \"5\")\n",
              "        .then(() => maybeLoadScript(\"vega-lite\", \"5.17.0\"))\n",
              "        .then(() => maybeLoadScript(\"vega-embed\", \"6\"))\n",
              "        .catch(showError)\n",
              "        .then(() => displayChart(vegaEmbed));\n",
              "    }\n",
              "  })({\"config\": {\"view\": {\"continuousWidth\": 400, \"continuousHeight\": 300}}, \"vconcat\": [{\"hconcat\": [{\"mark\": {\"type\": \"line\", \"interpolate\": \"step-after\"}, \"data\": {\"values\": [{\"percentile_ex_nulls\": 0.9449625015258789, \"percentile_inc_nulls\": 0.9450353980064392, \"value_count\": 2780, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 2780, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.8907960653305054, \"percentile_inc_nulls\": 0.8909407258033752, \"value_count\": 2736, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 2736, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.8621290326118469, \"percentile_inc_nulls\": 0.8623116612434387, \"value_count\": 1448, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 1448, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.8341153264045715, \"percentile_inc_nulls\": 0.8343350887298584, \"value_count\": 1415, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 1415, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.8082596063613892, \"percentile_inc_nulls\": 0.8085135817527771, \"value_count\": 1306, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 1306, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.7832155227661133, \"percentile_inc_nulls\": 0.7835026979446411, \"value_count\": 1265, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 1265, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.7582308650016785, 
\"percentile_inc_nulls\": 0.7585511207580566, \"value_count\": 1262, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 1262, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.6269921064376831, \"percentile_inc_nulls\": 0.6274862289428711, \"value_count\": 509, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 509, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.6171329021453857, \"percentile_inc_nulls\": 0.6176400780677795, \"value_count\": 498, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 498, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.6079269647598267, \"percentile_inc_nulls\": 0.6084463596343994, \"value_count\": 465, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 465, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.5987408757209778, \"percentile_inc_nulls\": 0.5992723703384399, \"value_count\": 464, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 464, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.5904456377029419, \"percentile_inc_nulls\": 0.5909881591796875, \"value_count\": 419, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 419, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.5831403136253357, \"percentile_inc_nulls\": 0.5836925506591797, \"value_count\": 369, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 369, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 
0.5762308835983276, \"percentile_inc_nulls\": 0.5767922401428223, \"value_count\": 349, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 349, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.5698164701461792, \"percentile_inc_nulls\": 0.5703862905502319, \"value_count\": 324, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 324, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.5634812116622925, \"percentile_inc_nulls\": 0.5640594959259033, \"value_count\": 320, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 320, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.5573043823242188, \"percentile_inc_nulls\": 0.557890772819519, \"value_count\": 312, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 312, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.5513848066329956, \"percentile_inc_nulls\": 0.551979124546051, \"value_count\": 299, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 299, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.5457029342651367, \"percentile_inc_nulls\": 0.5463047027587891, \"value_count\": 287, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 287, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.5401991605758667, \"percentile_inc_nulls\": 0.5408082604408264, \"value_count\": 278, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 278, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 
0.5348538160324097, \"percentile_inc_nulls\": 0.5354700088500977, \"value_count\": 270, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 270, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.5298252105712891, \"percentile_inc_nulls\": 0.5304480195045471, \"value_count\": 254, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 254, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.5250935554504395, \"percentile_inc_nulls\": 0.5257226228713989, \"value_count\": 239, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 239, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.7341569066047668, \"percentile_inc_nulls\": 0.7345091104507446, \"value_count\": 1216, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 1216, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.7161211967468262, \"percentile_inc_nulls\": 0.7164973020553589, \"value_count\": 911, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 911, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.698620080947876, \"percentile_inc_nulls\": 0.6990193128585815, \"value_count\": 884, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 884, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.6826632022857666, \"percentile_inc_nulls\": 0.6830835342407227, \"value_count\": 806, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 806, \"distinct_value_count\": 4413}, 
{\"percentile_ex_nulls\": 0.6697155237197876, \"percentile_inc_nulls\": 0.670153021812439, \"value_count\": 654, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 654, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.6573221683502197, \"percentile_inc_nulls\": 0.6577761173248291, \"value_count\": 626, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 626, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.6471858024597168, \"percentile_inc_nulls\": 0.6476531028747559, \"value_count\": 512, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 512, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.6370691657066345, \"percentile_inc_nulls\": 0.6375499367713928, \"value_count\": 511, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 511, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.5204806923866272, \"percentile_inc_nulls\": 0.5211158990859985, \"value_count\": 233, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 233, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.5164815187454224, \"percentile_inc_nulls\": 0.5171220302581787, \"value_count\": 202, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 202, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.51287841796875, \"percentile_inc_nulls\": 0.5135236978530884, \"value_count\": 182, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 182, \"distinct_value_count\": 4413}, 
{\"percentile_ex_nulls\": 0.5092949867248535, \"percentile_inc_nulls\": 0.5099450349807739, \"value_count\": 181, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 181, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.5058304071426392, \"percentile_inc_nulls\": 0.5064850449562073, \"value_count\": 175, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 175, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.49898040294647217, \"percentile_inc_nulls\": 0.49964410066604614, \"value_count\": 173, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 346, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.49563461542129517, \"percentile_inc_nulls\": 0.4963027238845825, \"value_count\": 169, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 169, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4923086166381836, \"percentile_inc_nulls\": 0.4929811358451843, \"value_count\": 168, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 168, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4890221953392029, \"percentile_inc_nulls\": 0.4896990656852722, \"value_count\": 166, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 166, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.48587435483932495, \"percentile_inc_nulls\": 0.48655539751052856, \"value_count\": 159, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 159, \"distinct_value_count\": 
4413}, {\"percentile_ex_nulls\": 0.4827859401702881, \"percentile_inc_nulls\": 0.4834710955619812, \"value_count\": 156, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 156, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.47664862871170044, \"percentile_inc_nulls\": 0.4773419499397278, \"value_count\": 155, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 310, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4705509543418884, \"percentile_inc_nulls\": 0.47125232219696045, \"value_count\": 154, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 308, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4675813317298889, \"percentile_inc_nulls\": 0.4682866334915161, \"value_count\": 150, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 150, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.46471065282821655, \"percentile_inc_nulls\": 0.4654197692871094, \"value_count\": 145, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 145, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.46195876598358154, \"percentile_inc_nulls\": 0.46267151832580566, \"value_count\": 139, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 139, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.459286093711853, \"percentile_inc_nulls\": 0.4600023627281189, \"value_count\": 135, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 135, 
\"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.45405954122543335, \"percentile_inc_nulls\": 0.45478272438049316, \"value_count\": 132, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 264, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.451485812664032, \"percentile_inc_nulls\": 0.45221245288848877, \"value_count\": 130, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 130, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4464572072029114, \"percentile_inc_nulls\": 0.44719046354293823, \"value_count\": 127, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 254, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.44402211904525757, \"percentile_inc_nulls\": 0.4447585940361023, \"value_count\": 123, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 123, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.44164639711380005, \"percentile_inc_nulls\": 0.44238603115081787, \"value_count\": 120, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 120, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4393102526664734, \"percentile_inc_nulls\": 0.44005298614501953, \"value_count\": 118, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 118, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4370335340499878, \"percentile_inc_nulls\": 0.4377792477607727, \"value_count\": 115, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, 
\"sum_tokens_in_value_count_group\": 115, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4347766041755676, \"percentile_inc_nulls\": 0.4355252981185913, \"value_count\": 114, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 114, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4325592517852783, \"percentile_inc_nulls\": 0.43331092596054077, \"value_count\": 112, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 112, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4304013252258301, \"percentile_inc_nulls\": 0.43115586042404175, \"value_count\": 109, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 109, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4283621311187744, \"percentile_inc_nulls\": 0.42911940813064575, \"value_count\": 103, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 103, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.42634278535842896, \"percentile_inc_nulls\": 0.4271026849746704, \"value_count\": 102, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 102, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4223436713218689, \"percentile_inc_nulls\": 0.42310887575149536, \"value_count\": 101, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 202, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4203639030456543, \"percentile_inc_nulls\": 0.4211317300796509, \"value_count\": 100, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 
50578, \"sum_tokens_in_value_count_group\": 100, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.418443500995636, \"percentile_inc_nulls\": 0.4192138910293579, \"value_count\": 97, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 97, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.25091564655303955, \"percentile_inc_nulls\": 0.25190794467926025, \"value_count\": 23, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 161, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.24612462520599365, \"percentile_inc_nulls\": 0.24712324142456055, \"value_count\": 22, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 242, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.2411355972290039, \"percentile_inc_nulls\": 0.24214082956314087, \"value_count\": 21, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 252, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.23638415336608887, \"percentile_inc_nulls\": 0.23739570379257202, \"value_count\": 20, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 240, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.22998952865600586, \"percentile_inc_nulls\": 0.23100954294204712, \"value_count\": 19, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 323, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.2228623628616333, \"percentile_inc_nulls\": 0.22389179468154907, \"value_count\": 18, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, 
\"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 360, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.2188236117362976, \"percentile_inc_nulls\": 0.21985840797424316, \"value_count\": 17, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 204, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.20995426177978516, \"percentile_inc_nulls\": 0.21100085973739624, \"value_count\": 16, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 448, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.20193618535995483, \"percentile_inc_nulls\": 0.20299339294433594, \"value_count\": 15, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 405, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.194452702999115, \"percentile_inc_nulls\": 0.1955198049545288, \"value_count\": 14, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 378, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.18158423900604248, \"percentile_inc_nulls\": 0.18266832828521729, \"value_count\": 13, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 650, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.17279404401779175, \"percentile_inc_nulls\": 0.17388981580734253, \"value_count\": 12, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 444, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.16538971662521362, \"percentile_inc_nulls\": 0.16649532318115234, \"value_count\": 11, \"group_name\": \"first_name\", \"total_non_null_rows\": 
50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 374, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.15588682889938354, \"percentile_inc_nulls\": 0.15700501203536987, \"value_count\": 10, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 480, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.14394885301589966, \"percentile_inc_nulls\": 0.14508283138275146, \"value_count\": 9, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 603, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.13143670558929443, \"percentile_inc_nulls\": 0.13258731365203857, \"value_count\": 8, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 632, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.12145870923995972, \"percentile_inc_nulls\": 0.12262248992919922, \"value_count\": 7, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 504, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.11076796054840088, \"percentile_inc_nulls\": 0.11194592714309692, \"value_count\": 6, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 540, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.10017621517181396, \"percentile_inc_nulls\": 0.10136818885803223, \"value_count\": 5, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 535, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.08655542135238647, \"percentile_inc_nulls\": 0.08776545524597168, \"value_count\": 4, \"group_name\": \"first_name\", 
\"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 688, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.0702817440032959, \"percentile_inc_nulls\": 0.07151329517364502, \"value_count\": 3, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 822, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.047276854515075684, \"percentile_inc_nulls\": 0.04853886365890503, \"value_count\": 2, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 1162, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.0, \"percentile_inc_nulls\": 0.0013247132301330566, \"value_count\": 1, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 2388, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4165429472923279, \"percentile_inc_nulls\": 0.41731584072113037, \"value_count\": 96, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 96, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4146621823310852, \"percentile_inc_nulls\": 0.4154375195503235, \"value_count\": 95, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 95, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4128011465072632, \"percentile_inc_nulls\": 0.4135790467262268, \"value_count\": 94, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 94, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4055156111717224, \"percentile_inc_nulls\": 0.4063031077384949, \"value_count\": 92, \"group_name\": \"first_name\", 
\"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 368, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.4001108407974243, \"percentile_inc_nulls\": 0.4009055495262146, \"value_count\": 91, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 273, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.39832907915115356, \"percentile_inc_nulls\": 0.3991261124610901, \"value_count\": 90, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 90, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3948447108268738, \"percentile_inc_nulls\": 0.395646333694458, \"value_count\": 88, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 176, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.39139991998672485, \"percentile_inc_nulls\": 0.392206072807312, \"value_count\": 87, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 174, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.38971710205078125, \"percentile_inc_nulls\": 0.3905255198478699, \"value_count\": 85, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 85, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3864702582359314, \"percentile_inc_nulls\": 0.3872830271720886, \"value_count\": 82, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 164, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3848666548728943, \"percentile_inc_nulls\": 0.38568150997161865, \"value_count\": 81, \"group_name\": \"first_name\", 
\"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 81, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3816990256309509, \"percentile_inc_nulls\": 0.38251811265945435, \"value_count\": 80, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 160, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3801349997520447, \"percentile_inc_nulls\": 0.38095617294311523, \"value_count\": 79, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 79, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.37859082221984863, \"percentile_inc_nulls\": 0.3794139623641968, \"value_count\": 78, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 78, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.37706637382507324, \"percentile_inc_nulls\": 0.3778916001319885, \"value_count\": 77, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 77, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.37556177377700806, \"percentile_inc_nulls\": 0.3763889670372009, \"value_count\": 76, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 76, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3740769624710083, \"percentile_inc_nulls\": 0.374906063079834, \"value_count\": 75, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 75, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3726118803024292, \"percentile_inc_nulls\": 0.37344300746917725, \"value_count\": 74, \"group_name\": \"first_name\", 
\"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 74, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3697214722633362, \"percentile_inc_nulls\": 0.3705563545227051, \"value_count\": 73, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 146, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.36687058210372925, \"percentile_inc_nulls\": 0.36770927906036377, \"value_count\": 72, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 144, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3654649257659912, \"percentile_inc_nulls\": 0.36630553007125854, \"value_count\": 71, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 71, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3640791177749634, \"percentile_inc_nulls\": 0.364921510219574, \"value_count\": 70, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 70, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.362713098526001, \"percentile_inc_nulls\": 0.36355727910995483, \"value_count\": 69, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 69, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3586743474006653, \"percentile_inc_nulls\": 0.3595238924026489, \"value_count\": 68, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 204, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.35734790563583374, \"percentile_inc_nulls\": 0.35819923877716064, \"value_count\": 67, \"group_name\": \"first_name\", 
\"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 67, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.35606104135513306, \"percentile_inc_nulls\": 0.35691410303115845, \"value_count\": 65, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 65, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3547940254211426, \"percentile_inc_nulls\": 0.3556486964225769, \"value_count\": 64, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 64, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3522995114326477, \"percentile_inc_nulls\": 0.35315752029418945, \"value_count\": 63, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 126, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3510720729827881, \"percentile_inc_nulls\": 0.35193169116973877, \"value_count\": 62, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 62, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.34744906425476074, \"percentile_inc_nulls\": 0.34831351041793823, \"value_count\": 61, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 183, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3427768349647522, \"percentile_inc_nulls\": 0.34364742040634155, \"value_count\": 59, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 236, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.33933204412460327, \"percentile_inc_nulls\": 0.34020721912384033, \"value_count\": 58, \"group_name\": \"first_name\", 
\"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 174, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3382035493850708, \"percentile_inc_nulls\": 0.33908021450042725, \"value_count\": 57, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 57, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.33709490299224854, \"percentile_inc_nulls\": 0.33797305822372437, \"value_count\": 56, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 56, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.33602583408355713, \"percentile_inc_nulls\": 0.3369053602218628, \"value_count\": 54, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 54, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3328779935836792, \"percentile_inc_nulls\": 0.33376169204711914, \"value_count\": 53, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 159, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3287600874900818, \"percentile_inc_nulls\": 0.32964926958084106, \"value_count\": 52, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 208, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.32674068212509155, \"percentile_inc_nulls\": 0.3276325464248657, \"value_count\": 51, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 102, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.32575082778930664, \"percentile_inc_nulls\": 0.32664400339126587, \"value_count\": 50, \"group_name\": 
\"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 50, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3238106369972229, \"percentile_inc_nulls\": 0.32470637559890747, \"value_count\": 49, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 98, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.32095980644226074, \"percentile_inc_nulls\": 0.32185930013656616, \"value_count\": 48, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 144, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3200293183326721, \"percentile_inc_nulls\": 0.3209300637245178, \"value_count\": 47, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 47, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.31729722023010254, \"percentile_inc_nulls\": 0.31820160150527954, \"value_count\": 46, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 138, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.31373363733291626, \"percentile_inc_nulls\": 0.3146427273750305, \"value_count\": 45, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 180, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.30937814712524414, \"percentile_inc_nulls\": 0.3102930188179016, \"value_count\": 44, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 220, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.3076755404472351, \"percentile_inc_nulls\": 0.30859267711639404, \"value_count\": 43, \"group_name\": 
\"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 86, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.30434954166412354, \"percentile_inc_nulls\": 0.30527108907699585, \"value_count\": 42, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 168, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.30353784561157227, \"percentile_inc_nulls\": 0.30446046590805054, \"value_count\": 41, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 41, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.2987864017486572, \"percentile_inc_nulls\": 0.2997152805328369, \"value_count\": 40, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 240, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.29569798707962036, \"percentile_inc_nulls\": 0.2966309189796448, \"value_count\": 39, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 156, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.2934410572052002, \"percentile_inc_nulls\": 0.29437702894210815, \"value_count\": 38, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 114, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.2912434935569763, \"percentile_inc_nulls\": 0.29218238592147827, \"value_count\": 37, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 111, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.2891647219657898, \"percentile_inc_nulls\": 0.2901063561439514, \"value_count\": 35, \"group_name\": 
\"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 105, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.28579914569854736, \"percentile_inc_nulls\": 0.28674525022506714, \"value_count\": 34, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 170, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.281879186630249, \"percentile_inc_nulls\": 0.28283047676086426, \"value_count\": 33, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 198, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.2799786329269409, \"percentile_inc_nulls\": 0.2809324264526367, \"value_count\": 32, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 96, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.2775236964225769, \"percentile_inc_nulls\": 0.27848076820373535, \"value_count\": 31, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 124, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.27336621284484863, \"percentile_inc_nulls\": 0.2743287682533264, \"value_count\": 30, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 210, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.27049553394317627, \"percentile_inc_nulls\": 0.2714619040489197, \"value_count\": 29, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 145, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.26772385835647583, \"percentile_inc_nulls\": 0.2686939239501953, \"value_count\": 28, \"group_name\": 
\"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 140, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.2639821171760559, \"percentile_inc_nulls\": 0.264957070350647, \"value_count\": 27, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 189, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.26037895679473877, \"percentile_inc_nulls\": 0.26135867834091187, \"value_count\": 26, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 182, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.2579042315483093, \"percentile_inc_nulls\": 0.25888729095458984, \"value_count\": 25, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 125, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 0.25410306453704834, \"percentile_inc_nulls\": 0.25509113073349, \"value_count\": 24, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 192, \"distinct_value_count\": 4413}, {\"percentile_ex_nulls\": 1.0, \"percentile_inc_nulls\": 1.0, \"value_count\": 2780, \"group_name\": \"first_name\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 2780, \"distinct_value_count\": 4413}]}, \"encoding\": {\"tooltip\": [{\"field\": \"value_count\", \"type\": \"quantitative\"}, {\"field\": \"percentile_ex_nulls\", \"type\": \"quantitative\"}, {\"field\": \"percentile_inc_nulls\", \"type\": \"quantitative\"}, {\"field\": \"total_non_null_rows\", \"type\": \"quantitative\"}, {\"field\": \"total_rows_inc_nulls\", \"type\": \"quantitative\"}], \"x\": {\"field\": \"percentile_ex_nulls\", \"sort\": \"descending\", \"title\": 
\"Percentile\", \"type\": \"quantitative\"}, \"y\": {\"field\": \"value_count\", \"title\": \"Count of values\", \"type\": \"quantitative\"}}, \"title\": {\"text\": \"Distribution of counts of values in column first_name\", \"subtitle\": \"In this col, 67 values (0.1%) are null and there are 4413 distinct values\"}}, {\"mark\": \"bar\", \"data\": {\"values\": [{\"value_count\": 2780, \"group_name\": \"first_name\", \"value\": \"william\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 2736, \"group_name\": \"first_name\", \"value\": \"john\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 1448, \"group_name\": \"first_name\", \"value\": \"thomas\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 1415, \"group_name\": \"first_name\", \"value\": \"george\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 1306, \"group_name\": \"first_name\", \"value\": \"henry\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 1265, \"group_name\": \"first_name\", \"value\": \"james\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 1262, \"group_name\": \"first_name\", \"value\": \"sir\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 1216, \"group_name\": \"first_name\", \"value\": \"charles\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 911, \"group_name\": \"first_name\", \"value\": \"edward\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 884, \"group_name\": 
\"first_name\", \"value\": \"robert\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}]}, \"encoding\": {\"tooltip\": [{\"field\": \"value\", \"type\": \"nominal\"}, {\"field\": \"value_count\", \"type\": \"quantitative\"}, {\"field\": \"total_non_null_rows\", \"type\": \"quantitative\"}, {\"field\": \"total_rows_inc_nulls\", \"type\": \"quantitative\"}], \"x\": {\"field\": \"value\", \"sort\": \"-y\", \"title\": null, \"type\": \"nominal\"}, \"y\": {\"field\": \"value_count\", \"title\": \"Value count\", \"type\": \"quantitative\"}}, \"title\": \"Top 10 values by value count\"}, {\"mark\": \"bar\", \"data\": {\"values\": [{\"value_count\": 1, \"group_name\": \"first_name\", \"value\": \"woollcomhe\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 1, \"group_name\": \"first_name\", \"value\": \"didley\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 1, \"group_name\": \"first_name\", \"value\": \"bakker\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 1, \"group_name\": \"first_name\", \"value\": \"gasquoine\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 1, \"group_name\": \"first_name\", \"value\": \"harild\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 1, \"group_name\": \"first_name\", \"value\": \"bhownagri\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 1, \"group_name\": \"first_name\", \"value\": \"kewley\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 1, \"group_name\": \"first_name\", \"value\": \"stanldy\", 
\"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 1, \"group_name\": \"first_name\", \"value\": \"fance\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}, {\"value_count\": 1, \"group_name\": \"first_name\", \"value\": \"abgayne\", \"total_non_null_rows\": 50511, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 4413}]}, \"encoding\": {\"tooltip\": [{\"field\": \"value\", \"type\": \"nominal\"}, {\"field\": \"value_count\", \"type\": \"quantitative\"}, {\"field\": \"total_non_null_rows\", \"type\": \"quantitative\"}, {\"field\": \"total_rows_inc_nulls\", \"type\": \"quantitative\"}], \"x\": {\"field\": \"value\", \"sort\": \"-y\", \"title\": null, \"type\": \"nominal\"}, \"y\": {\"field\": \"value_count\", \"scale\": {\"domain\": [0, 2780]}, \"title\": \"Value count\", \"type\": \"quantitative\"}}, \"title\": \"Bottom 10 values by value count\"}]}, {\"hconcat\": [{\"mark\": {\"type\": \"line\", \"interpolate\": \"step-after\"}, \"data\": {\"values\": [{\"percentile_ex_nulls\": 0.07418102025985718, \"percentile_inc_nulls\": 0.15682709217071533, \"value_count\": 88, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 88, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.0722922682762146, \"percentile_inc_nulls\": 0.15510696172714233, \"value_count\": 87, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 87, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.07042527198791504, \"percentile_inc_nulls\": 0.15340662002563477, \"value_count\": 86, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 86, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 
0.06690835952758789, \"percentile_inc_nulls\": 0.1502036452293396, \"value_count\": 81, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 162, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.06521505117416382, \"percentile_inc_nulls\": 0.14866149425506592, \"value_count\": 78, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 78, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.06356513500213623, \"percentile_inc_nulls\": 0.14715886116027832, \"value_count\": 76, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 76, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.06202375888824463, \"percentile_inc_nulls\": 0.14575505256652832, \"value_count\": 71, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 71, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.06052577495574951, \"percentile_inc_nulls\": 0.14439082145690918, \"value_count\": 69, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 69, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.05761677026748657, \"percentile_inc_nulls\": 0.14174145460128784, \"value_count\": 67, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 134, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.0562056303024292, \"percentile_inc_nulls\": 0.14045631885528564, \"value_count\": 65, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 65, 
\"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.054816246032714844, \"percentile_inc_nulls\": 0.13919097185134888, \"value_count\": 64, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 64, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.05344855785369873, \"percentile_inc_nulls\": 0.13794535398483276, \"value_count\": 63, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 63, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.05210256576538086, \"percentile_inc_nulls\": 0.13671952486038208, \"value_count\": 62, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 62, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.04949742555618286, \"percentile_inc_nulls\": 0.13434696197509766, \"value_count\": 60, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 120, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.04821658134460449, \"percentile_inc_nulls\": 0.1331804394721985, \"value_count\": 59, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 59, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.04702258110046387, \"percentile_inc_nulls\": 0.13209301233291626, \"value_count\": 55, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 55, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.04595881700515747, \"percentile_inc_nulls\": 0.13112419843673706, \"value_count\": 49, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, 
\"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 49, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.04493844509124756, \"percentile_inc_nulls\": 0.13019496202468872, \"value_count\": 47, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 47, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.043939828872680664, \"percentile_inc_nulls\": 0.12928545475006104, \"value_count\": 46, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 46, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.042984604835510254, \"percentile_inc_nulls\": 0.1284155249595642, \"value_count\": 44, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 44, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.042051076889038086, \"percentile_inc_nulls\": 0.12756532430648804, \"value_count\": 43, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 43, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.040227532386779785, \"percentile_inc_nulls\": 0.12590456008911133, \"value_count\": 42, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 84, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.03933745622634888, \"percentile_inc_nulls\": 0.12509393692016602, \"value_count\": 41, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 41, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.03846907615661621, \"percentile_inc_nulls\": 0.12430304288864136, \"value_count\": 40, 
\"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 40, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.03764408826828003, \"percentile_inc_nulls\": 0.12355172634124756, \"value_count\": 38, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 38, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.0329548716545105, \"percentile_inc_nulls\": 0.1192811131477356, \"value_count\": 36, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 216, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.03221672773361206, \"percentile_inc_nulls\": 0.11860889196395874, \"value_count\": 34, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 34, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.03150033950805664, \"percentile_inc_nulls\": 0.11795639991760254, \"value_count\": 33, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 33, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.03015434741973877, \"percentile_inc_nulls\": 0.11673057079315186, \"value_count\": 31, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 62, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.028851807117462158, \"percentile_inc_nulls\": 0.11554431915283203, \"value_count\": 30, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 60, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.02759265899658203, 
\"percentile_inc_nulls\": 0.11439758539199829, \"value_count\": 29, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 58, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.026984810829162598, \"percentile_inc_nulls\": 0.11384397745132446, \"value_count\": 28, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 28, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.02581244707107544, \"percentile_inc_nulls\": 0.11277627944946289, \"value_count\": 27, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 54, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.024683594703674316, \"percentile_inc_nulls\": 0.11174821853637695, \"value_count\": 26, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 52, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.02414083480834961, \"percentile_inc_nulls\": 0.11125391721725464, \"value_count\": 25, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 25, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.02361983060836792, \"percentile_inc_nulls\": 0.11077940464019775, \"value_count\": 24, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 24, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.023120522499084473, \"percentile_inc_nulls\": 0.11032462120056152, \"value_count\": 23, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 23, 
\"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.02266460657119751, \"percentile_inc_nulls\": 0.10990947484970093, \"value_count\": 21, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 21, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.022230446338653564, \"percentile_inc_nulls\": 0.10951399803161621, \"value_count\": 20, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 20, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.020580530166625977, \"percentile_inc_nulls\": 0.10801136493682861, \"value_count\": 19, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 76, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.019408226013183594, \"percentile_inc_nulls\": 0.10694372653961182, \"value_count\": 18, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 54, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.019039154052734375, \"percentile_inc_nulls\": 0.10660761594772339, \"value_count\": 17, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 17, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.01799708604812622, \"percentile_inc_nulls\": 0.10565859079360962, \"value_count\": 16, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 48, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.01702016592025757, \"percentile_inc_nulls\": 0.10476887226104736, \"value_count\": 15, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, 
\"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 45, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.016412317752838135, \"percentile_inc_nulls\": 0.10421526432037354, \"value_count\": 14, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 28, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.015565633773803711, \"percentile_inc_nulls\": 0.1034441590309143, \"value_count\": 13, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 39, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.014784097671508789, \"percentile_inc_nulls\": 0.10273241996765137, \"value_count\": 12, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 36, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.014545321464538574, \"percentile_inc_nulls\": 0.10251492261886597, \"value_count\": 11, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 11, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.014111101627349854, \"percentile_inc_nulls\": 0.10211950540542603, \"value_count\": 10, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 20, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.013524949550628662, \"percentile_inc_nulls\": 0.10158568620681763, \"value_count\": 9, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 27, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.011961877346038818, \"percentile_inc_nulls\": 0.10016214847564697, \"value_count\": 8, 
\"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 72, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.010594189167022705, \"percentile_inc_nulls\": 0.09891653060913086, \"value_count\": 7, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 63, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.010333657264709473, \"percentile_inc_nulls\": 0.09867924451828003, \"value_count\": 6, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 12, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.008488357067108154, \"percentile_inc_nulls\": 0.09699869155883789, \"value_count\": 5, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 85, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.006751596927642822, \"percentile_inc_nulls\": 0.09541696310043335, \"value_count\": 4, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 80, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.004797756671905518, \"percentile_inc_nulls\": 0.09363752603530884, \"value_count\": 3, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 90, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.0024966001510620117, \"percentile_inc_nulls\": 0.09154176712036133, \"value_count\": 2, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 106, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.0, 
\"percentile_inc_nulls\": 0.08926808834075928, \"value_count\": 1, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 115, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.9620519876480103, \"percentile_inc_nulls\": 0.96543949842453, \"value_count\": 1748, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 1748, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.9272083640098572, \"percentile_inc_nulls\": 0.9337063431739807, \"value_count\": 1605, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 1605, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.8945791721343994, \"percentile_inc_nulls\": 0.903989851474762, \"value_count\": 1503, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 1503, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.8677246570587158, \"percentile_inc_nulls\": 0.8795325756072998, \"value_count\": 1237, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 1237, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.8411957621574402, \"percentile_inc_nulls\": 0.85537189245224, \"value_count\": 1222, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 1222, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.8200073838233948, \"percentile_inc_nulls\": 0.836074948310852, \"value_count\": 976, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 976, 
\"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.7988624572753906, \"percentile_inc_nulls\": 0.816817581653595, \"value_count\": 974, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 974, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.7782385349273682, \"percentile_inc_nulls\": 0.7980347275733948, \"value_count\": 950, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 950, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.7581573128700256, \"percentile_inc_nulls\": 0.7797461152076721, \"value_count\": 925, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 925, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.7402036190032959, \"percentile_inc_nulls\": 0.7633951306343079, \"value_count\": 827, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 827, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.7224019169807434, \"percentile_inc_nulls\": 0.7471826076507568, \"value_count\": 820, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 820, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.7047739028930664, \"percentile_inc_nulls\": 0.7311281561851501, \"value_count\": 812, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 812, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.6874281167984009, \"percentile_inc_nulls\": 0.7153307795524597, \"value_count\": 799, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, 
\"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 799, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.6718190312385559, \"percentile_inc_nulls\": 0.7011151313781738, \"value_count\": 719, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 719, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.6564704775810242, \"percentile_inc_nulls\": 0.687136709690094, \"value_count\": 707, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 707, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.6416212320327759, \"percentile_inc_nulls\": 0.6736130714416504, \"value_count\": 684, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 684, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.6276403665542603, \"percentile_inc_nulls\": 0.6608802080154419, \"value_count\": 644, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 644, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.6140068769454956, \"percentile_inc_nulls\": 0.6484637260437012, \"value_count\": 628, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 628, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.6006121635437012, \"percentile_inc_nulls\": 0.6362648010253906, \"value_count\": 617, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 617, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.5745174884796143, \"percentile_inc_nulls\": 0.612499475479126, \"value_count\": 601, \"group_name\": 
\"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 1202, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.5632286071777344, \"percentile_inc_nulls\": 0.6022183895111084, \"value_count\": 520, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 520, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.5519614219665527, \"percentile_inc_nulls\": 0.5919569730758667, \"value_count\": 519, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 519, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.5411719083786011, \"percentile_inc_nulls\": 0.5821305513381958, \"value_count\": 497, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 497, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.5307079553604126, \"percentile_inc_nulls\": 0.5726007223129272, \"value_count\": 482, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 482, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.5206781625747681, \"percentile_inc_nulls\": 0.5634663105010986, \"value_count\": 462, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 462, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.5006621479988098, \"percentile_inc_nulls\": 0.5452370643615723, \"value_count\": 461, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 922, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.4908711910247803, 
\"percentile_inc_nulls\": 0.536320149898529, \"value_count\": 451, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 451, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.4815361499786377, \"percentile_inc_nulls\": 0.5278184413909912, \"value_count\": 430, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 430, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.47239649295806885, \"percentile_inc_nulls\": 0.5194946527481079, \"value_count\": 421, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 421, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.463473916053772, \"percentile_inc_nulls\": 0.5113685727119446, \"value_count\": 411, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 411, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.45459479093551636, \"percentile_inc_nulls\": 0.5032820701599121, \"value_count\": 409, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 409, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.4458025097846985, \"percentile_inc_nulls\": 0.4952746033668518, \"value_count\": 405, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 405, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.4373575448989868, \"percentile_inc_nulls\": 0.4875835180282593, \"value_count\": 389, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 389, 
\"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.4289342761039734, \"percentile_inc_nulls\": 0.4799122214317322, \"value_count\": 388, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 388, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.4205544590950012, \"percentile_inc_nulls\": 0.47228044271469116, \"value_count\": 386, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 386, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.41258710622787476, \"percentile_inc_nulls\": 0.46502429246902466, \"value_count\": 367, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 367, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.40464144945144653, \"percentile_inc_nulls\": 0.45778799057006836, \"value_count\": 366, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 366, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.39678269624710083, \"percentile_inc_nulls\": 0.45063072443008423, \"value_count\": 362, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 362, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.3893146514892578, \"percentile_inc_nulls\": 0.44382935762405396, \"value_count\": 344, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 344, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.3819117546081543, \"percentile_inc_nulls\": 0.4370872974395752, \"value_count\": 341, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, 
\"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 341, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.3745739459991455, \"percentile_inc_nulls\": 0.43040454387664795, \"value_count\": 338, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 338, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.36732304096221924, \"percentile_inc_nulls\": 0.42380088567733765, \"value_count\": 334, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 334, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.36039769649505615, \"percentile_inc_nulls\": 0.4174937605857849, \"value_count\": 319, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 319, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.3535592555999756, \"percentile_inc_nulls\": 0.4112657904624939, \"value_count\": 315, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 315, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.34678590297698975, \"percentile_inc_nulls\": 0.4050970673561096, \"value_count\": 312, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 312, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.3400343060493469, \"percentile_inc_nulls\": 0.39894813299179077, \"value_count\": 311, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 311, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.3333261013031006, \"percentile_inc_nulls\": 0.3928387761116028, \"value_count\": 309, 
\"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 309, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.32679158449172974, \"percentile_inc_nulls\": 0.3868875503540039, \"value_count\": 301, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 301, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.32030045986175537, \"percentile_inc_nulls\": 0.3809759020805359, \"value_count\": 299, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 299, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.3138744831085205, \"percentile_inc_nulls\": 0.3751235604286194, \"value_count\": 296, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 296, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.30751359462738037, \"percentile_inc_nulls\": 0.3693305253982544, \"value_count\": 293, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 293, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.30128300189971924, \"percentile_inc_nulls\": 0.3636561632156372, \"value_count\": 287, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 287, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.2950741648674011, \"percentile_inc_nulls\": 0.3580015301704407, \"value_count\": 286, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 286, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.28903889656066895, 
\"percentile_inc_nulls\": 0.35250502824783325, \"value_count\": 278, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 278, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.28302544355392456, \"percentile_inc_nulls\": 0.34702837467193604, \"value_count\": 277, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 277, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.27117210626602173, \"percentile_inc_nulls\": 0.33623313903808594, \"value_count\": 273, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 546, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.25940561294555664, \"percentile_inc_nulls\": 0.3255169987678528, \"value_count\": 271, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 542, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.25356578826904297, \"percentile_inc_nulls\": 0.3201984763145447, \"value_count\": 269, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 269, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.24776935577392578, \"percentile_inc_nulls\": 0.31491953134536743, \"value_count\": 267, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 267, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.24208152294158936, \"percentile_inc_nulls\": 0.3097394108772278, \"value_count\": 262, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 262, 
\"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.23645877838134766, \"percentile_inc_nulls\": 0.30461859703063965, \"value_count\": 259, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 259, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.23096626996994019, \"percentile_inc_nulls\": 0.2996164560317993, \"value_count\": 253, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 253, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.22551721334457397, \"percentile_inc_nulls\": 0.2946537733078003, \"value_count\": 251, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 251, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.22019845247268677, \"percentile_inc_nulls\": 0.28980982303619385, \"value_count\": 245, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 245, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.21496647596359253, \"percentile_inc_nulls\": 0.28504490852355957, \"value_count\": 241, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 241, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.20977789163589478, \"percentile_inc_nulls\": 0.2803195118904114, \"value_count\": 239, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 239, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.20474135875701904, \"percentile_inc_nulls\": 0.27573251724243164, \"value_count\": 232, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, 
\"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 232, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.19974815845489502, \"percentile_inc_nulls\": 0.27118510007858276, \"value_count\": 230, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 230, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.194776713848114, \"percentile_inc_nulls\": 0.26665741205215454, \"value_count\": 229, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 229, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.18989211320877075, \"percentile_inc_nulls\": 0.26220887899398804, \"value_count\": 225, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 225, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.18505090475082397, \"percentile_inc_nulls\": 0.25779980421066284, \"value_count\": 223, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 223, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.18053537607192993, \"percentile_inc_nulls\": 0.25368738174438477, \"value_count\": 208, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 208, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.17604148387908936, \"percentile_inc_nulls\": 0.24959468841552734, \"value_count\": 207, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 207, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.17161279916763306, \"percentile_inc_nulls\": 0.24556130170822144, \"value_count\": 204, 
\"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 204, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.167205810546875, \"percentile_inc_nulls\": 0.24154770374298096, \"value_count\": 203, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 203, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.16308099031448364, \"percentile_inc_nulls\": 0.23779112100601196, \"value_count\": 190, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 190, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.1589779257774353, \"percentile_inc_nulls\": 0.2340543270111084, \"value_count\": 189, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 189, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.1550702452659607, \"percentile_inc_nulls\": 0.23049545288085938, \"value_count\": 180, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 180, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.14747196435928345, \"percentile_inc_nulls\": 0.22357547283172607, \"value_count\": 175, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 350, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.1437596082687378, \"percentile_inc_nulls\": 0.22019457817077637, \"value_count\": 171, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 171, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.14013415575027466, 
\"percentile_inc_nulls\": 0.21689271926879883, \"value_count\": 167, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 167, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.13327401876449585, \"percentile_inc_nulls\": 0.21064496040344238, \"value_count\": 158, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 316, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.12997418642044067, \"percentile_inc_nulls\": 0.2076396942138672, \"value_count\": 152, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 152, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.12695658206939697, \"percentile_inc_nulls\": 0.20489144325256348, \"value_count\": 139, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 139, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.1240692138671875, \"percentile_inc_nulls\": 0.20226186513900757, \"value_count\": 133, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 133, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.12129038572311401, \"percentile_inc_nulls\": 0.19973111152648926, \"value_count\": 128, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 128, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.11853331327438354, \"percentile_inc_nulls\": 0.19722014665603638, \"value_count\": 127, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 127, 
\"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.11579793691635132, \"percentile_inc_nulls\": 0.19472891092300415, \"value_count\": 126, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 126, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.11312764883041382, \"percentile_inc_nulls\": 0.1922970414161682, \"value_count\": 123, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 123, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.11047911643981934, \"percentile_inc_nulls\": 0.18988490104675293, \"value_count\": 122, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 122, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.1078522801399231, \"percentile_inc_nulls\": 0.18749260902404785, \"value_count\": 121, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 121, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.1052471399307251, \"percentile_inc_nulls\": 0.18511998653411865, \"value_count\": 120, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 120, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.10268545150756836, \"percentile_inc_nulls\": 0.1827870011329651, \"value_count\": 118, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 118, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.10018885135650635, \"percentile_inc_nulls\": 0.18051326274871826, \"value_count\": 115, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, 
\"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 115, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.0977357029914856, \"percentile_inc_nulls\": 0.1782791018486023, \"value_count\": 113, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 113, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.09532594680786133, \"percentile_inc_nulls\": 0.1760844588279724, \"value_count\": 111, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 111, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.09298133850097656, \"percentile_inc_nulls\": 0.17394912242889404, \"value_count\": 108, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 108, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.0907018780708313, \"percentile_inc_nulls\": 0.17187315225601196, \"value_count\": 105, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 105, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.08844411373138428, \"percentile_inc_nulls\": 0.16981691122055054, \"value_count\": 104, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 104, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.08627313375473022, \"percentile_inc_nulls\": 0.16783976554870605, \"value_count\": 100, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 100, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.08414560556411743, \"percentile_inc_nulls\": 0.16590219736099243, \"value_count\": 98, 
\"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 98, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.0820615291595459, \"percentile_inc_nulls\": 0.16400408744812012, \"value_count\": 96, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 96, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.08002084493637085, \"percentile_inc_nulls\": 0.16214561462402344, \"value_count\": 94, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 94, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.07802355289459229, \"percentile_inc_nulls\": 0.16032660007476807, \"value_count\": 92, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 92, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 0.076091468334198, \"percentile_inc_nulls\": 0.15856695175170898, \"value_count\": 89, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 89, \"distinct_value_count\": 447}, {\"percentile_ex_nulls\": 1.0, \"percentile_inc_nulls\": 1.0, \"value_count\": 88, \"group_name\": \"substr_surname_1_2_\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"sum_tokens_in_value_count_group\": 88, \"distinct_value_count\": 447}]}, \"encoding\": {\"tooltip\": [{\"field\": \"value_count\", \"type\": \"quantitative\"}, {\"field\": \"percentile_ex_nulls\", \"type\": \"quantitative\"}, {\"field\": \"percentile_inc_nulls\", \"type\": \"quantitative\"}, {\"field\": \"total_non_null_rows\", \"type\": \"quantitative\"}, {\"field\": \"total_rows_inc_nulls\", \"type\": \"quantitative\"}], \"x\": {\"field\": 
\"percentile_ex_nulls\", \"sort\": \"descending\", \"title\": \"Percentile\", \"type\": \"quantitative\"}, \"y\": {\"field\": \"value_count\", \"title\": \"Count of values\", \"type\": \"quantitative\"}}, \"title\": {\"text\": \"Distribution of counts of values in column substr(surname,1,2)\", \"subtitle\": \"In this col, 4,515 values (8.9%) are null and there are 447 distinct values\"}}, {\"mark\": \"bar\", \"data\": {\"values\": [{\"value_count\": 1748, \"group_name\": \"substr_surname_1_2_\", \"value\": \"ba\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}, {\"value_count\": 1605, \"group_name\": \"substr_surname_1_2_\", \"value\": \"ma\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}, {\"value_count\": 1503, \"group_name\": \"substr_surname_1_2_\", \"value\": \"ha\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}, {\"value_count\": 1237, \"group_name\": \"substr_surname_1_2_\", \"value\": \"st\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}, {\"value_count\": 1222, \"group_name\": \"substr_surname_1_2_\", \"value\": \"br\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}, {\"value_count\": 976, \"group_name\": \"substr_surname_1_2_\", \"value\": \"co\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}, {\"value_count\": 974, \"group_name\": \"substr_surname_1_2_\", \"value\": \"be\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}, {\"value_count\": 950, \"group_name\": \"substr_surname_1_2_\", \"value\": \"ro\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}, {\"value_count\": 925, \"group_name\": \"substr_surname_1_2_\", \"value\": \"wa\", \"total_non_null_rows\": 46063, 
\"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}, {\"value_count\": 827, \"group_name\": \"substr_surname_1_2_\", \"value\": \"ca\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}]}, \"encoding\": {\"tooltip\": [{\"field\": \"value\", \"type\": \"nominal\"}, {\"field\": \"value_count\", \"type\": \"quantitative\"}, {\"field\": \"total_non_null_rows\", \"type\": \"quantitative\"}, {\"field\": \"total_rows_inc_nulls\", \"type\": \"quantitative\"}], \"x\": {\"field\": \"value\", \"sort\": \"-y\", \"title\": null, \"type\": \"nominal\"}, \"y\": {\"field\": \"value_count\", \"title\": \"Value count\", \"type\": \"quantitative\"}}, \"title\": \"Top 10 values by value count\"}, {\"mark\": \"bar\", \"data\": {\"values\": [{\"value_count\": 1, \"group_name\": \"substr_surname_1_2_\", \"value\": \"2r\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}, {\"value_count\": 1, \"group_name\": \"substr_surname_1_2_\", \"value\": \"kk\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}, {\"value_count\": 1, \"group_name\": \"substr_surname_1_2_\", \"value\": \"kb\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}, {\"value_count\": 1, \"group_name\": \"substr_surname_1_2_\", \"value\": \"qh\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}, {\"value_count\": 1, \"group_name\": \"substr_surname_1_2_\", \"value\": \"kh\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}, {\"value_count\": 1, \"group_name\": \"substr_surname_1_2_\", \"value\": \"d0\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}, {\"value_count\": 1, \"group_name\": \"substr_surname_1_2_\", \"value\": \"xi\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 
50578, \"distinct_value_count\": 447}, {\"value_count\": 1, \"group_name\": \"substr_surname_1_2_\", \"value\": \"ft\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}, {\"value_count\": 1, \"group_name\": \"substr_surname_1_2_\", \"value\": \"3a\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}, {\"value_count\": 1, \"group_name\": \"substr_surname_1_2_\", \"value\": \"g0\", \"total_non_null_rows\": 46063, \"total_rows_inc_nulls\": 50578, \"distinct_value_count\": 447}]}, \"encoding\": {\"tooltip\": [{\"field\": \"value\", \"type\": \"nominal\"}, {\"field\": \"value_count\", \"type\": \"quantitative\"}, {\"field\": \"total_non_null_rows\", \"type\": \"quantitative\"}, {\"field\": \"total_rows_inc_nulls\", \"type\": \"quantitative\"}], \"x\": {\"field\": \"value\", \"sort\": \"-y\", \"title\": null, \"type\": \"nominal\"}, \"y\": {\"field\": \"value_count\", \"scale\": {\"domain\": [0, 1748]}, \"title\": \"Value count\", \"type\": \"quantitative\"}}, \"title\": \"Bottom 10 values by value count\"}]}], \"$schema\": \"https://vega.github.io/schema/vega-lite/v5.9.3.json\"}, {\"mode\": \"vega-lite\"});\n",
              "</script>"
            ],
            "text/plain": [
              "alt.VConcatChart(...)"
            ]
          },
          "execution_count": 5,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "from splink.exploratory import profile_columns\n",
        "\n",
        "# Profile the value distribution of first_name and of the first two characters of surname\n",
        "profile_columns(df, db_api, column_expressions=[\"first_name\", \"substr(surname,1,2)\"])"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "id": "3f8b54a9-5f4a-423b-90ec-d7e93f54f3d9",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "\n",
              "<style>\n",
              "  #altair-viz-8c180af77f784603b8bbb80cd92a8d0e.vega-embed {\n",
              "    width: 100%;\n",
              "    display: flex;\n",
              "  }\n",
              "\n",
              "  #altair-viz-8c180af77f784603b8bbb80cd92a8d0e.vega-embed details,\n",
              "  #altair-viz-8c180af77f784603b8bbb80cd92a8d0e.vega-embed details summary {\n",
              "    position: relative;\n",
              "  }\n",
              "</style>\n",
              "<div id=\"altair-viz-8c180af77f784603b8bbb80cd92a8d0e\"></div>\n",
              "<script type=\"text/javascript\">\n",
              "  var VEGA_DEBUG = (typeof VEGA_DEBUG == \"undefined\") ? {} : VEGA_DEBUG;\n",
              "  (function(spec, embedOpt){\n",
              "    let outputDiv = document.currentScript.previousElementSibling;\n",
              "    if (outputDiv.id !== \"altair-viz-8c180af77f784603b8bbb80cd92a8d0e\") {\n",
              "      outputDiv = document.getElementById(\"altair-viz-8c180af77f784603b8bbb80cd92a8d0e\");\n",
              "    }\n",
              "    const paths = {\n",
              "      \"vega\": \"https://cdn.jsdelivr.net/npm/vega@5?noext\",\n",
              "      \"vega-lib\": \"https://cdn.jsdelivr.net/npm/vega-lib?noext\",\n",
              "      \"vega-lite\": \"https://cdn.jsdelivr.net/npm/vega-lite@5.17.0?noext\",\n",
              "      \"vega-embed\": \"https://cdn.jsdelivr.net/npm/vega-embed@6?noext\",\n",
              "    };\n",
              "\n",
              "    function maybeLoadScript(lib, version) {\n",
              "      var key = `${lib.replace(\"-\", \"\")}_version`;\n",
              "      return (VEGA_DEBUG[key] == version) ?\n",
              "        Promise.resolve(paths[lib]) :\n",
              "        new Promise(function(resolve, reject) {\n",
              "          var s = document.createElement('script');\n",
              "          document.getElementsByTagName(\"head\")[0].appendChild(s);\n",
              "          s.async = true;\n",
              "          s.onload = () => {\n",
              "            VEGA_DEBUG[key] = version;\n",
              "            return resolve(paths[lib]);\n",
              "          };\n",
              "          s.onerror = () => reject(`Error loading script: ${paths[lib]}`);\n",
              "          s.src = paths[lib];\n",
              "        });\n",
              "    }\n",
              "\n",
              "    function showError(err) {\n",
              "      outputDiv.innerHTML = `<div class=\"error\" style=\"color:red;\">${err}</div>`;\n",
              "      throw err;\n",
              "    }\n",
              "\n",
              "    function displayChart(vegaEmbed) {\n",
              "      vegaEmbed(outputDiv, spec, embedOpt)\n",
              "        .catch(err => showError(`Javascript Error: ${err.message}<br>This usually means there's a typo in your chart specification. See the javascript console for the full traceback.`));\n",
              "    }\n",
              "\n",
              "    if(typeof define === \"function\" && define.amd) {\n",
              "      requirejs.config({paths});\n",
              "      require([\"vega-embed\"], displayChart, err => showError(`Error loading script: ${err.message}`));\n",
              "    } else {\n",
              "      maybeLoadScript(\"vega\", \"5\")\n",
              "        .then(() => maybeLoadScript(\"vega-lite\", \"5.17.0\"))\n",
              "        .then(() => maybeLoadScript(\"vega-embed\", \"6\"))\n",
              "        .catch(showError)\n",
              "        .then(() => displayChart(vegaEmbed));\n",
              "    }\n",
              "  })({\"config\": {\"view\": {\"continuousWidth\": 300, \"continuousHeight\": 300}}, \"data\": {\"name\": \"data-0781c8ebb6c383be10fdb76027f3501c\"}, \"mark\": \"bar\", \"encoding\": {\"order\": {\"field\": \"cumulative_rows\"}, \"tooltip\": [{\"field\": \"blocking_rule\", \"title\": \"SQL Condition\", \"type\": \"nominal\"}, {\"field\": \"row_count\", \"format\": \",\", \"title\": \"Comparisons Generated\", \"type\": \"quantitative\"}, {\"field\": \"cumulative_rows\", \"format\": \",\", \"title\": \"Cumulative Comparisons\", \"type\": \"quantitative\"}, {\"field\": \"cartesian\", \"format\": \",\", \"title\": \"Total comparisons in Cartesian product\", \"type\": \"quantitative\"}], \"x\": {\"field\": \"start\", \"title\": \"Comparisons Generated by Rule(s)\", \"type\": \"quantitative\"}, \"x2\": {\"field\": \"cumulative_rows\"}, \"y\": {\"field\": \"blocking_rule\", \"sort\": [\"-x2\"], \"title\": \"SQL Blocking Rule\"}}, \"height\": {\"step\": 20}, \"title\": {\"text\": \"Count of Additional Comparisons Generated by Each Blocking Rule\", \"subtitle\": \"(Counts exclude comparisons already generated by previous rules)\"}, \"width\": 450, \"$schema\": \"https://vega.github.io/schema/vega-lite/v5.9.3.json\", \"datasets\": {\"data-0781c8ebb6c383be10fdb76027f3501c\": [{\"blocking_rule\": \"(l.\\\"first_name\\\" = r.\\\"first_name\\\") AND (l.\\\"surname\\\" = r.\\\"surname\\\")\", \"row_count\": 243656, \"cumulative_rows\": 243656, \"cartesian\": 1279041753, \"match_key\": \"0\", \"start\": 0}, {\"blocking_rule\": \"(l.\\\"surname\\\" = r.\\\"surname\\\") AND (l.\\\"dob\\\" = r.\\\"dob\\\")\", \"row_count\": 25041, \"cumulative_rows\": 268697, \"cartesian\": 1279041753, \"match_key\": \"1\", \"start\": 243656}]}}, {\"mode\": \"vega-lite\"});\n",
              "</script>"
            ],
            "text/plain": [
              "alt.Chart(...)"
            ]
          },
          "execution_count": 6,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "from splink.blocking_analysis import (\n",
        "    cumulative_comparisons_to_be_scored_from_blocking_rules_chart,\n",
        ")\n",
        "from splink import block_on\n",
        "\n",
        "# Chart how many additional comparisons each blocking rule generates, in order\n",
        "cumulative_comparisons_to_be_scored_from_blocking_rules_chart(\n",
        "    table_or_tables=df,\n",
        "    db_api=db_api,\n",
        "    blocking_rules=[block_on(\"first_name\", \"surname\"), block_on(\"surname\", \"dob\")],\n",
        "    link_type=\"dedupe_only\",\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 7,
      "id": "29cec53c-54f0-45ea-9b40-af2f1d66c788",
      "metadata": {},
      "outputs": [],
      "source": [
        "import splink.comparison_library as cl\n",
        "\n",
        "\n",
        "from splink import Linker, SettingsCreator\n",
        "\n",
        "settings = SettingsCreator(\n",
        "    link_type=\"dedupe_only\",\n",
        "    blocking_rules_to_generate_predictions=[\n",
        "        block_on(\"first_name\", \"surname\"),\n",
        "        block_on(\"surname\", \"dob\"),\n",
        "    ],\n",
        "    comparisons=[\n",
        "        cl.ExactMatch(\"first_name\").configure(term_frequency_adjustments=True),\n",
        "        cl.LevenshteinAtThresholds(\"surname\", [1, 3]),\n",
        "        cl.LevenshteinAtThresholds(\"dob\", [1, 2]),\n",
        "        cl.LevenshteinAtThresholds(\"postcode_fake\", [1, 2]),\n",
        "        cl.ExactMatch(\"birth_place\").configure(term_frequency_adjustments=True),\n",
        "        cl.ExactMatch(\"occupation\").configure(term_frequency_adjustments=True),\n",
        "    ],\n",
        "    retain_intermediate_calculation_columns=True,\n",
        ")\n",
        "\n",
        "linker = Linker(df, settings, db_api=db_api)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 9,
      "id": "ca798b76-cd39-4890-b842-ba5a0e583050",
      "metadata": {},
      "outputs": [
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "Probability two random records match is estimated to be  0.000136.\n",
            "This means that amongst all possible pairwise record comparisons, one in 7,362.31 are expected to match.  With 1,279,041,753 total possible comparisons, we expect a total of around 173,728.33 matching pairs\n"
          ]
        }
      ],
      "source": [
        "# recall=0.6 is a guess that these deterministic rules find about 60% of the true matches\n",
        "linker.training.estimate_probability_two_random_records_match(\n",
        "    [\n",
        "        block_on(\"first_name\", \"surname\", \"dob\"),\n",
        "        block_on(\"substr(first_name,1,2)\", \"surname\", \"substr(postcode_fake, 1,2)\"),\n",
        "        block_on(\"dob\", \"postcode_fake\"),\n",
        "    ],\n",
        "    recall=0.6,\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 10,
      "id": "3ba5c515-629c-490c-b8e4-a63ea242ea0a",
      "metadata": {},
      "outputs": [
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "----- Estimating u probabilities using random sampling -----\n",
            "\n",
            "Estimated u probabilities using random sampling\n",
            "\n",
            "Your model is not yet fully trained. Missing estimates for:\n",
            "    - first_name (no m values are trained).\n",
            "    - surname (no m values are trained).\n",
            "    - dob (no m values are trained).\n",
            "    - postcode_fake (no m values are trained).\n",
            "    - birth_place (no m values are trained).\n",
            "    - occupation (no m values are trained).\n"
          ]
        }
      ],
      "source": [
        "# Estimate u probabilities from up to 5 million randomly sampled record pairs\n",
        "linker.training.estimate_u_using_random_sampling(max_pairs=5e6)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 11,
      "id": "ad8c0de1-769a-4861-849d-8b7e6655a681",
      "metadata": {},
      "outputs": [
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "\n",
            "----- Starting EM training session -----\n",
            "\n",
            "Estimating the m probabilities of the model by blocking on:\n",
            "(l.\"first_name\" = r.\"first_name\") AND (l.\"surname\" = r.\"surname\")\n",
            "\n",
            "Parameter estimates will be made for the following comparison(s):\n",
            "    - dob\n",
            "    - postcode_fake\n",
            "    - birth_place\n",
            "    - occupation\n",
            "\n",
            "Parameter estimates cannot be made for the following comparison(s) since they are used in the blocking rules: \n",
            "    - first_name\n",
            "    - surname\n",
            "\n",
            "Iteration 1: Largest change in params was -0.526 in probability_two_random_records_match\n",
            "Iteration 2: Largest change in params was -0.0321 in probability_two_random_records_match\n",
            "Iteration 3: Largest change in params was 0.0109 in the m_probability of birth_place, level `Exact match on birth_place`\n",
            "Iteration 4: Largest change in params was -0.00341 in the m_probability of birth_place, level `All other comparisons`\n",
            "Iteration 5: Largest change in params was -0.00116 in the m_probability of dob, level `All other comparisons`\n",
            "Iteration 6: Largest change in params was -0.000547 in the m_probability of dob, level `All other comparisons`\n",
            "Iteration 7: Largest change in params was -0.00029 in the m_probability of dob, level `All other comparisons`\n",
            "Iteration 8: Largest change in params was -0.000169 in the m_probability of dob, level `All other comparisons`\n",
            "Iteration 9: Largest change in params was -0.000105 in the m_probability of dob, level `All other comparisons`\n",
            "Iteration 10: Largest change in params was -6.87e-05 in the m_probability of dob, level `All other comparisons`\n",
            "\n",
            "EM converged after 10 iterations\n",
            "\n",
            "Your model is not yet fully trained. Missing estimates for:\n",
            "    - first_name (no m values are trained).\n",
            "    - surname (no m values are trained).\n"
          ]
        }
      ],
      "source": [
        "# Blocking on first_name and surname means their m values cannot be estimated in this session\n",
        "blocking_rule = block_on(\"first_name\", \"surname\")\n",
        "training_session_names = (\n",
        "    linker.training.estimate_parameters_using_expectation_maximisation(blocking_rule)\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 12,
      "id": "c44fc676-e57e-4e8c-b9c6-8989e720b03a",
      "metadata": {},
      "outputs": [
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "\n",
            "----- Starting EM training session -----\n",
            "\n"
          ]
        },
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "Estimating the m probabilities of the model by blocking on:\n",
            "l.\"dob\" = r.\"dob\"\n",
            "\n",
            "Parameter estimates will be made for the following comparison(s):\n",
            "    - first_name\n",
            "    - surname\n",
            "    - postcode_fake\n",
            "    - birth_place\n",
            "    - occupation\n",
            "\n",
            "Parameter estimates cannot be made for the following comparison(s) since they are used in the blocking rules: \n",
            "    - dob\n",
            "\n",
            "Iteration 1: Largest change in params was -0.355 in the m_probability of first_name, level `Exact match on first_name`\n",
            "Iteration 2: Largest change in params was -0.0383 in the m_probability of first_name, level `Exact match on first_name`\n",
            "Iteration 3: Largest change in params was 0.00531 in the m_probability of postcode_fake, level `All other comparisons`\n",
            "Iteration 4: Largest change in params was 0.00129 in the m_probability of postcode_fake, level `All other comparisons`\n",
            "Iteration 5: Largest change in params was 0.00034 in the m_probability of surname, level `All other comparisons`\n",
            "Iteration 6: Largest change in params was 8.9e-05 in the m_probability of surname, level `All other comparisons`\n",
            "\n",
            "EM converged after 6 iterations\n",
            "\n",
            "Your model is fully trained. All comparisons have at least one estimate for their m and u values\n"
          ]
        }
      ],
      "source": [
        "# A second session blocked on dob fills in the m values for first_name and surname\n",
        "blocking_rule = block_on(\"dob\")\n",
        "training_session_dob = (\n",
        "    linker.training.estimate_parameters_using_expectation_maximisation(blocking_rule)\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 13,
      "id": "31b6440a-4353-45af-a986-ba59c0d784d3",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "\n",
              "<style>\n",
              "  #altair-viz-c9586004ad19451e93e4cd4bd86f0f54.vega-embed {\n",
              "    width: 100%;\n",
              "    display: flex;\n",
              "  }\n",
              "\n",
              "  #altair-viz-c9586004ad19451e93e4cd4bd86f0f54.vega-embed details,\n",
              "  #altair-viz-c9586004ad19451e93e4cd4bd86f0f54.vega-embed details summary {\n",
              "    position: relative;\n",
              "  }\n",
              "</style>\n",
              "<div id=\"altair-viz-c9586004ad19451e93e4cd4bd86f0f54\"></div>\n",
              "<script type=\"text/javascript\">\n",
              "  var VEGA_DEBUG = (typeof VEGA_DEBUG == \"undefined\") ? {} : VEGA_DEBUG;\n",
              "  (function(spec, embedOpt){\n",
              "    let outputDiv = document.currentScript.previousElementSibling;\n",
              "    if (outputDiv.id !== \"altair-viz-c9586004ad19451e93e4cd4bd86f0f54\") {\n",
              "      outputDiv = document.getElementById(\"altair-viz-c9586004ad19451e93e4cd4bd86f0f54\");\n",
              "    }\n",
              "    const paths = {\n",
              "      \"vega\": \"https://cdn.jsdelivr.net/npm/vega@5?noext\",\n",
              "      \"vega-lib\": \"https://cdn.jsdelivr.net/npm/vega-lib?noext\",\n",
              "      \"vega-lite\": \"https://cdn.jsdelivr.net/npm/vega-lite@5.17.0?noext\",\n",
              "      \"vega-embed\": \"https://cdn.jsdelivr.net/npm/vega-embed@6?noext\",\n",
              "    };\n",
              "\n",
              "    function maybeLoadScript(lib, version) {\n",
              "      var key = `${lib.replace(\"-\", \"\")}_version`;\n",
              "      return (VEGA_DEBUG[key] == version) ?\n",
              "        Promise.resolve(paths[lib]) :\n",
              "        new Promise(function(resolve, reject) {\n",
              "          var s = document.createElement('script');\n",
              "          document.getElementsByTagName(\"head\")[0].appendChild(s);\n",
              "          s.async = true;\n",
              "          s.onload = () => {\n",
              "            VEGA_DEBUG[key] = version;\n",
              "            return resolve(paths[lib]);\n",
              "          };\n",
              "          s.onerror = () => reject(`Error loading script: ${paths[lib]}`);\n",
              "          s.src = paths[lib];\n",
              "        });\n",
              "    }\n",
              "\n",
              "    function showError(err) {\n",
              "      outputDiv.innerHTML = `<div class=\"error\" style=\"color:red;\">${err}</div>`;\n",
              "      throw err;\n",
              "    }\n",
              "\n",
              "    function displayChart(vegaEmbed) {\n",
              "      vegaEmbed(outputDiv, spec, embedOpt)\n",
              "        .catch(err => showError(`Javascript Error: ${err.message}<br>This usually means there's a typo in your chart specification. See the javascript console for the full traceback.`));\n",
              "    }\n",
              "\n",
              "    if(typeof define === \"function\" && define.amd) {\n",
              "      requirejs.config({paths});\n",
              "      require([\"vega-embed\"], displayChart, err => showError(`Error loading script: ${err.message}`));\n",
              "    } else {\n",
              "      maybeLoadScript(\"vega\", \"5\")\n",
              "        .then(() => maybeLoadScript(\"vega-lite\", \"5.17.0\"))\n",
              "        .then(() => maybeLoadScript(\"vega-embed\", \"6\"))\n",
              "        .catch(showError)\n",
              "        .then(() => displayChart(vegaEmbed));\n",
              "    }\n",
              "  })({\"config\": {\"view\": {\"continuousWidth\": 300, \"continuousHeight\": 300, \"discreteHeight\": 60, \"discreteWidth\": 400}, \"header\": {\"title\": null}, \"mark\": {\"tooltip\": null}, \"title\": {\"anchor\": \"middle\"}}, \"vconcat\": [{\"mark\": {\"type\": \"bar\", \"clip\": true, \"height\": 15}, \"encoding\": {\"color\": {\"field\": \"log2_bayes_factor\", \"scale\": {\"domain\": [-10, 0, 10], \"interpolate\": \"lab\", \"range\": [\"red\", \"#bbbbbb\", \"green\"]}, \"title\": \"Match weight\", \"type\": \"quantitative\"}, \"tooltip\": [{\"field\": \"comparison_name\", \"title\": \"Comparison name\", \"type\": \"nominal\"}, {\"field\": \"probability_two_random_records_match\", \"format\": \".4f\", \"title\": \"Probability two random records match\", \"type\": \"nominal\"}, {\"field\": \"log2_bayes_factor\", \"format\": \",.4f\", \"title\": \"Equivalent match weight\", \"type\": \"quantitative\"}, {\"field\": \"bayes_factor_description\", \"title\": \"Match weight description\", \"type\": \"nominal\"}], \"x\": {\"axis\": {\"domain\": false, \"gridColor\": {\"condition\": {\"test\": \"abs(datum.value / 10)  <= 1 & datum.value % 10 === 0\", \"value\": \"#aaa\"}, \"value\": \"#ddd\"}, \"gridDash\": {\"condition\": {\"test\": \"abs(datum.value / 10) == 1\", \"value\": [3]}, \"value\": null}, \"gridWidth\": {\"condition\": {\"test\": \"abs(datum.value / 10)  <= 1 & datum.value % 10 === 0\", \"value\": 2}, \"value\": 1}, \"labels\": false, \"ticks\": false, \"title\": \"\"}, \"field\": \"log2_bayes_factor\", \"scale\": {\"domain\": [-13, 13]}, \"type\": \"quantitative\"}, \"y\": {\"axis\": {\"title\": \"Prior (starting) match weight\", \"titleAlign\": \"right\", \"titleAngle\": 0, \"titleFontWeight\": \"normal\"}, \"field\": \"label_for_charts\", \"sort\": {\"field\": \"comparison_vector_value\", \"order\": \"descending\"}, \"type\": \"nominal\"}}, \"height\": 20, \"transform\": [{\"filter\": \"(datum.comparison_name == 
'probability_two_random_records_match')\"}]}, {\"mark\": {\"type\": \"bar\", \"clip\": true}, \"encoding\": {\"color\": {\"field\": \"log2_bayes_factor\", \"scale\": {\"domain\": [-10, 0, 10], \"interpolate\": \"lab\", \"range\": [\"red\", \"#bbbbbb\", \"green\"]}, \"title\": \"Match weight\", \"type\": \"quantitative\"}, \"row\": {\"field\": \"comparison_name\", \"header\": {\"labelAlign\": \"left\", \"labelAnchor\": \"middle\", \"labelAngle\": 0}, \"sort\": {\"field\": \"comparison_sort_order\"}, \"type\": \"nominal\"}, \"tooltip\": [{\"field\": \"comparison_name\", \"title\": \"Comparison name\", \"type\": \"nominal\"}, {\"field\": \"label_for_charts\", \"title\": \"Label\", \"type\": \"ordinal\"}, {\"field\": \"sql_condition\", \"title\": \"SQL condition\", \"type\": \"nominal\"}, {\"field\": \"m_probability\", \"format\": \".4f\", \"title\": \"M probability\", \"type\": \"quantitative\"}, {\"field\": \"u_probability\", \"format\": \".4f\", \"title\": \"U probability\", \"type\": \"quantitative\"}, {\"field\": \"bayes_factor\", \"format\": \",.4f\", \"title\": \"Bayes factor = m/u\", \"type\": \"quantitative\"}, {\"field\": \"log2_bayes_factor\", \"format\": \",.4f\", \"title\": \"Match weight = log2(m/u)\", \"type\": \"quantitative\"}, {\"field\": \"bayes_factor_description\", \"title\": \"Match weight description\", \"type\": \"nominal\"}], \"x\": {\"axis\": {\"gridColor\": {\"condition\": {\"test\": \"abs(datum.value / 10)  <= 1 & datum.value % 10 === 0\", \"value\": \"#aaa\"}, \"value\": \"#ddd\"}, \"gridDash\": {\"condition\": {\"test\": \"abs(datum.value / 10) == 1\", \"value\": [3]}, \"value\": null}, \"gridWidth\": {\"condition\": {\"test\": \"abs(datum.value / 10)  <= 1 & datum.value % 10 === 0\", \"value\": 2}, \"value\": 1}, \"title\": \"Comparison level match weight = log2(m/u)\"}, \"field\": \"log2_bayes_factor\", \"scale\": {\"domain\": [-13, 13]}, \"type\": \"quantitative\"}, \"y\": {\"axis\": {\"title\": null}, \"field\": \"label_for_charts\", 
\"sort\": {\"field\": \"comparison_vector_value\", \"order\": \"descending\"}, \"type\": \"nominal\"}}, \"height\": {\"step\": 12}, \"resolve\": {\"axis\": {\"y\": \"independent\"}, \"scale\": {\"y\": \"independent\"}}, \"transform\": [{\"filter\": \"(datum.comparison_name != 'probability_two_random_records_match')\"}]}], \"data\": {\"name\": \"data-c6b88fe1e55cd8504dee80724272ab03\"}, \"params\": [{\"name\": \"mouse_zoom\", \"select\": {\"type\": \"interval\", \"encodings\": [\"x\"]}, \"bind\": \"scales\", \"views\": []}], \"resolve\": {\"axis\": {\"y\": \"independent\"}, \"scale\": {\"y\": \"independent\"}}, \"title\": {\"text\": \"Model parameters (components of final match weight)\", \"subtitle\": \"Use mousewheel to zoom\"}, \"$schema\": \"https://vega.github.io/schema/vega-lite/v5.9.3.json\", \"datasets\": {\"data-c6b88fe1e55cd8504dee80724272ab03\": [{\"comparison_name\": \"probability_two_random_records_match\", \"sql_condition\": null, \"label_for_charts\": \"\", \"m_probability\": null, \"u_probability\": null, \"m_probability_description\": null, \"u_probability_description\": null, \"has_tf_adjustments\": false, \"tf_adjustment_column\": null, \"tf_adjustment_weight\": null, \"is_null_level\": false, \"bayes_factor\": 0.00013584539607096294, \"log2_bayes_factor\": -12.845746707461347, \"comparison_vector_value\": 0, \"max_comparison_vector_value\": 0, \"bayes_factor_description\": \"The probability that two random records drawn at random match is 0.000 or one in  7,362.3 records.This is equivalent to a starting match weight of -12.846.\", \"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": -1}, {\"comparison_name\": \"first_name\", \"sql_condition\": \"\\\"first_name_l\\\" = \\\"first_name_r\\\"\", \"label_for_charts\": \"Exact match on first_name\", \"m_probability\": 0.5506790732247939, \"u_probability\": 0.012901382443454592, \"m_probability_description\": \"Amongst matching record comparisons, 55.07% of 
records are in the exact match on first_name comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 1.29% of records are in the exact match on first_name comparison level\", \"has_tf_adjustments\": true, \"tf_adjustment_column\": \"first_name\", \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 42.68372599900535, \"log2_bayes_factor\": 5.41561421401266, \"comparison_vector_value\": 1, \"max_comparison_vector_value\": 1, \"bayes_factor_description\": \"If comparison level is `exact match on first_name` then comparison is 42.68 times more likely to be a match\", \"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": 0}, {\"comparison_name\": \"first_name\", \"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.4493209267752062, \"u_probability\": 0.9870986175565454, \"m_probability_description\": \"Amongst matching record comparisons, 44.93% of records are in the all other comparisons comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 98.71% of records are in the all other comparisons comparison level\", \"has_tf_adjustments\": false, \"tf_adjustment_column\": null, \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 0.4551935528867936, \"log2_bayes_factor\": -1.1354479706438325, \"comparison_vector_value\": 0, \"max_comparison_vector_value\": 1, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  2.20 times less likely to be a match\", \"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": 0}, {\"comparison_name\": \"surname\", \"sql_condition\": \"\\\"surname_l\\\" = \\\"surname_r\\\"\", \"label_for_charts\": \"Exact match on surname\", \"m_probability\": 0.7798213256103345, \"u_probability\": 0.0007567808197398964, \"m_probability_description\": \"Amongst matching record 
comparisons, 77.98% of records are in the exact match on surname comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 0.08% of records are in the exact match on surname comparison level\", \"has_tf_adjustments\": false, \"tf_adjustment_column\": null, \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 1030.4454146688827, \"log2_bayes_factor\": 10.00905236831435, \"comparison_vector_value\": 3, \"max_comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `exact match on surname` then comparison is 1,030.45 times more likely to be a match\", \"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": 1}, {\"comparison_name\": \"surname\", \"sql_condition\": \"levenshtein_distance(\\\"surname_l\\\", \\\"surname_r\\\") <= 1\", \"label_for_charts\": \"Levenshtein distance of surname <= 1\", \"m_probability\": 0.12808911387599578, \"u_probability\": 0.00037912419864151236, \"m_probability_description\": \"Amongst matching record comparisons, 12.81% of records are in the levenshtein distance of surname <= 1 comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 0.04% of records are in the levenshtein distance of surname <= 1 comparison level\", \"has_tf_adjustments\": false, \"tf_adjustment_column\": null, \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 337.8552841917451, \"log2_bayes_factor\": 8.400261609398289, \"comparison_vector_value\": 2, \"max_comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `levenshtein distance of surname <= 1` then comparison is 337.86 times more likely to be a match\", \"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": 1}, {\"comparison_name\": \"surname\", \"sql_condition\": \"levenshtein_distance(\\\"surname_l\\\", \\\"surname_r\\\") <= 3\", \"label_for_charts\": 
\"Levenshtein distance of surname <= 3\", \"m_probability\": 0.027582225705042273, \"u_probability\": 0.017783860071373187, \"m_probability_description\": \"Amongst matching record comparisons, 2.76% of records are in the levenshtein distance of surname <= 3 comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 1.78% of records are in the levenshtein distance of surname <= 3 comparison level\", \"has_tf_adjustments\": false, \"tf_adjustment_column\": null, \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 1.550969564219727, \"log2_bayes_factor\": 0.6331703756195893, \"comparison_vector_value\": 1, \"max_comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `levenshtein distance of surname <= 3` then comparison is 1.55 times more likely to be a match\", \"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": 1}, {\"comparison_name\": \"surname\", \"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.06450733480862748, \"u_probability\": 0.9810802349102454, \"m_probability_description\": \"Amongst matching record comparisons, 6.45% of records are in the all other comparisons comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 98.11% of records are in the all other comparisons comparison level\", \"has_tf_adjustments\": false, \"tf_adjustment_column\": null, \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 0.06575133461385956, \"log2_bayes_factor\": -3.926836011396319, \"comparison_vector_value\": 0, \"max_comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  15.21 times less likely to be a match\", \"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": 1}, {\"comparison_name\": \"dob\", \"sql_condition\": 
\"\\\"dob_l\\\" = \\\"dob_r\\\"\", \"label_for_charts\": \"Exact match on dob\", \"m_probability\": 0.6198138543741121, \"u_probability\": 0.0019758263205186424, \"m_probability_description\": \"Amongst matching record comparisons, 61.98% of records are in the exact match on dob comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 0.20% of records are in the exact match on dob comparison level\", \"has_tf_adjustments\": false, \"tf_adjustment_column\": null, \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 313.6985513035451, \"log2_bayes_factor\": 8.293235056437856, \"comparison_vector_value\": 3, \"max_comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `exact match on dob` then comparison is 313.70 times more likely to be a match\", \"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": 2}, {\"comparison_name\": \"dob\", \"sql_condition\": \"levenshtein_distance(\\\"dob_l\\\", \\\"dob_r\\\") <= 1\", \"label_for_charts\": \"Levenshtein distance of dob <= 1\", \"m_probability\": 0.342688305586749, \"u_probability\": 0.019056406632372083, \"m_probability_description\": \"Amongst matching record comparisons, 34.27% of records are in the levenshtein distance of dob <= 1 comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 1.91% of records are in the levenshtein distance of dob <= 1 comparison level\", \"has_tf_adjustments\": false, \"tf_adjustment_column\": null, \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 17.982839692589632, \"log2_bayes_factor\": 4.16854895149798, \"comparison_vector_value\": 2, \"max_comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `levenshtein distance of dob <= 1` then comparison is 17.98 times more likely to be a match\", \"probability_two_random_records_match\": 0.00013582694460587586, 
\"comparison_sort_order\": 2}, {\"comparison_name\": \"dob\", \"sql_condition\": \"levenshtein_distance(\\\"dob_l\\\", \\\"dob_r\\\") <= 2\", \"label_for_charts\": \"Levenshtein distance of dob <= 2\", \"m_probability\": 0.03721225022777856, \"u_probability\": 0.07550582672079083, \"m_probability_description\": \"Amongst matching record comparisons, 3.72% of records are in the levenshtein distance of dob <= 2 comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 7.55% of records are in the levenshtein distance of dob <= 2 comparison level\", \"has_tf_adjustments\": false, \"tf_adjustment_column\": null, \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 0.49283945152185216, \"log2_bayes_factor\": -1.0208103473025933, \"comparison_vector_value\": 1, \"max_comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `levenshtein distance of dob <= 2` then comparison is  2.03 times less likely to be a match\", \"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": 2}, {\"comparison_name\": \"dob\", \"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.00028558981136024217, \"u_probability\": 0.9034619403263184, \"m_probability_description\": \"Amongst matching record comparisons, 0.03% of records are in the all other comparisons comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 90.35% of records are in the all other comparisons comparison level\", \"has_tf_adjustments\": false, \"tf_adjustment_column\": null, \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 0.000316106078864917, \"log2_bayes_factor\": -11.62730360035277, \"comparison_vector_value\": 0, \"max_comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  3,163.50 times less likely to be a match\", 
\"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": 2}, {\"comparison_name\": \"postcode_fake\", \"sql_condition\": \"\\\"postcode_fake_l\\\" = \\\"postcode_fake_r\\\"\", \"label_for_charts\": \"Exact match on postcode_fake\", \"m_probability\": 0.687726663068843, \"u_probability\": 0.00015071615069623225, \"m_probability_description\": \"Amongst matching record comparisons, 68.77% of records are in the exact match on postcode_fake comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 0.02% of records are in the exact match on postcode_fake comparison level\", \"has_tf_adjustments\": false, \"tf_adjustment_column\": null, \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 4563.058835379581, \"log2_bayes_factor\": 12.155785540453788, \"comparison_vector_value\": 3, \"max_comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `exact match on postcode_fake` then comparison is 4,563.06 times more likely to be a match\", \"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": 3}, {\"comparison_name\": \"postcode_fake\", \"sql_condition\": \"levenshtein_distance(\\\"postcode_fake_l\\\", \\\"postcode_fake_r\\\") <= 1\", \"label_for_charts\": \"Levenshtein distance of postcode_fake <= 1\", \"m_probability\": 0.08663919768424605, \"u_probability\": 7.900978825044774e-05, \"m_probability_description\": \"Amongst matching record comparisons, 8.66% of records are in the levenshtein distance of postcode_fake <= 1 comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 0.01% of records are in the levenshtein distance of postcode_fake <= 1 comparison level\", \"has_tf_adjustments\": false, \"tf_adjustment_column\": null, \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 1096.5628386398703, \"log2_bayes_factor\": 10.098772772821805, 
\"comparison_vector_value\": 2, \"max_comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `levenshtein distance of postcode_fake <= 1` then comparison is 1,096.56 times more likely to be a match\", \"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": 3}, {\"comparison_name\": \"postcode_fake\", \"sql_condition\": \"levenshtein_distance(\\\"postcode_fake_l\\\", \\\"postcode_fake_r\\\") <= 2\", \"label_for_charts\": \"Levenshtein distance of postcode_fake <= 2\", \"m_probability\": 0.05609324905133215, \"u_probability\": 0.0005148915192287583, \"m_probability_description\": \"Amongst matching record comparisons, 5.61% of records are in the levenshtein distance of postcode_fake <= 2 comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 0.05% of records are in the levenshtein distance of postcode_fake <= 2 comparison level\", \"has_tf_adjustments\": false, \"tf_adjustment_column\": null, \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 108.94187796169702, \"log2_bayes_factor\": 6.767414831743295, \"comparison_vector_value\": 1, \"max_comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `levenshtein distance of postcode_fake <= 2` then comparison is 108.94 times more likely to be a match\", \"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": 3}, {\"comparison_name\": \"postcode_fake\", \"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.1695408901955787, \"u_probability\": 0.9992553825418246, \"m_probability_description\": \"Amongst matching record comparisons, 16.95% of records are in the all other comparisons comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 99.93% of records are in the all other comparisons comparison level\", \"has_tf_adjustments\": false, 
\"tf_adjustment_column\": null, \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 0.16966722737516246, \"log2_bayes_factor\": -2.559220171547059, \"comparison_vector_value\": 0, \"max_comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  5.89 times less likely to be a match\", \"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": 3}, {\"comparison_name\": \"birth_place\", \"sql_condition\": \"\\\"birth_place_l\\\" = \\\"birth_place_r\\\"\", \"label_for_charts\": \"Exact match on birth_place\", \"m_probability\": 0.8462634396206725, \"u_probability\": 0.005265650721787716, \"m_probability_description\": \"Amongst matching record comparisons, 84.63% of records are in the exact match on birth_place comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 0.53% of records are in the exact match on birth_place comparison level\", \"has_tf_adjustments\": true, \"tf_adjustment_column\": \"birth_place\", \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 160.7139334401887, \"log2_bayes_factor\": 7.328351201760086, \"comparison_vector_value\": 1, \"max_comparison_vector_value\": 1, \"bayes_factor_description\": \"If comparison level is `exact match on birth_place` then comparison is 160.71 times more likely to be a match\", \"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": 4}, {\"comparison_name\": \"birth_place\", \"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.15373656037932748, \"u_probability\": 0.9947343492782122, \"m_probability_description\": \"Amongst matching record comparisons, 15.37% of records are in the all other comparisons comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 99.47% of records are in the all other comparisons 
comparison level\", \"has_tf_adjustments\": false, \"tf_adjustment_column\": null, \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 0.15455036863950666, \"log2_bayes_factor\": -2.693850999515206, \"comparison_vector_value\": 0, \"max_comparison_vector_value\": 1, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  6.47 times less likely to be a match\", \"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": 4}, {\"comparison_name\": \"occupation\", \"sql_condition\": \"\\\"occupation_l\\\" = \\\"occupation_r\\\"\", \"label_for_charts\": \"Exact match on occupation\", \"m_probability\": 0.8993324609099845, \"u_probability\": 0.03924326946398254, \"m_probability_description\": \"Amongst matching record comparisons, 89.93% of records are in the exact match on occupation comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 3.92% of records are in the exact match on occupation comparison level\", \"has_tf_adjustments\": true, \"tf_adjustment_column\": \"occupation\", \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 22.91685869179151, \"log2_bayes_factor\": 4.518337396383288, \"comparison_vector_value\": 1, \"max_comparison_vector_value\": 1, \"bayes_factor_description\": \"If comparison level is `exact match on occupation` then comparison is 22.92 times more likely to be a match\", \"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": 5}, {\"comparison_name\": \"occupation\", \"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.10066753909001541, \"u_probability\": 0.9607567305360175, \"m_probability_description\": \"Amongst matching record comparisons, 10.07% of records are in the all other comparisons comparison level\", \"u_probability_description\": \"Amongst non-matching record comparisons, 96.08% of 
records are in the all other comparisons comparison level\", \"has_tf_adjustments\": false, \"tf_adjustment_column\": null, \"tf_adjustment_weight\": 1.0, \"is_null_level\": false, \"bayes_factor\": 0.10477942635265412, \"log2_bayes_factor\": -3.254572626225918, \"comparison_vector_value\": 0, \"max_comparison_vector_value\": 1, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  9.54 times less likely to be a match\", \"probability_two_random_records_match\": 0.00013582694460587586, \"comparison_sort_order\": 5}]}}, {\"mode\": \"vega-lite\"});\n",
              "</script>"
            ],
            "text/plain": [
              "alt.VConcatChart(...)"
            ]
          },
          "execution_count": 13,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "linker.visualisations.match_weights_chart()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 14,
      "id": "e9b076af-b956-4e85-abfa-5c45d92a3cac",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "\n",
              "<style>\n",
              "  #altair-viz-20c4ad8950ba440583f12c500cbaaf3d.vega-embed {\n",
              "    width: 100%;\n",
              "    display: flex;\n",
              "  }\n",
              "\n",
              "  #altair-viz-20c4ad8950ba440583f12c500cbaaf3d.vega-embed details,\n",
              "  #altair-viz-20c4ad8950ba440583f12c500cbaaf3d.vega-embed details summary {\n",
              "    position: relative;\n",
              "  }\n",
              "</style>\n",
              "<div id=\"altair-viz-20c4ad8950ba440583f12c500cbaaf3d\"></div>\n",
              "<script type=\"text/javascript\">\n",
              "  var VEGA_DEBUG = (typeof VEGA_DEBUG == \"undefined\") ? {} : VEGA_DEBUG;\n",
              "  (function(spec, embedOpt){\n",
              "    let outputDiv = document.currentScript.previousElementSibling;\n",
              "    if (outputDiv.id !== \"altair-viz-20c4ad8950ba440583f12c500cbaaf3d\") {\n",
              "      outputDiv = document.getElementById(\"altair-viz-20c4ad8950ba440583f12c500cbaaf3d\");\n",
              "    }\n",
              "    const paths = {\n",
              "      \"vega\": \"https://cdn.jsdelivr.net/npm/vega@5?noext\",\n",
              "      \"vega-lib\": \"https://cdn.jsdelivr.net/npm/vega-lib?noext\",\n",
              "      \"vega-lite\": \"https://cdn.jsdelivr.net/npm/vega-lite@5.17.0?noext\",\n",
              "      \"vega-embed\": \"https://cdn.jsdelivr.net/npm/vega-embed@6?noext\",\n",
              "    };\n",
              "\n",
              "    function maybeLoadScript(lib, version) {\n",
              "      var key = `${lib.replace(\"-\", \"\")}_version`;\n",
              "      return (VEGA_DEBUG[key] == version) ?\n",
              "        Promise.resolve(paths[lib]) :\n",
              "        new Promise(function(resolve, reject) {\n",
              "          var s = document.createElement('script');\n",
              "          document.getElementsByTagName(\"head\")[0].appendChild(s);\n",
              "          s.async = true;\n",
              "          s.onload = () => {\n",
              "            VEGA_DEBUG[key] = version;\n",
              "            return resolve(paths[lib]);\n",
              "          };\n",
              "          s.onerror = () => reject(`Error loading script: ${paths[lib]}`);\n",
              "          s.src = paths[lib];\n",
              "        });\n",
              "    }\n",
              "\n",
              "    function showError(err) {\n",
              "      outputDiv.innerHTML = `<div class=\"error\" style=\"color:red;\">${err}</div>`;\n",
              "      throw err;\n",
              "    }\n",
              "\n",
              "    function displayChart(vegaEmbed) {\n",
              "      vegaEmbed(outputDiv, spec, embedOpt)\n",
              "        .catch(err => showError(`Javascript Error: ${err.message}<br>This usually means there's a typo in your chart specification. See the javascript console for the full traceback.`));\n",
              "    }\n",
              "\n",
              "    if(typeof define === \"function\" && define.amd) {\n",
              "      requirejs.config({paths});\n",
              "      require([\"vega-embed\"], displayChart, err => showError(`Error loading script: ${err.message}`));\n",
              "    } else {\n",
              "      maybeLoadScript(\"vega\", \"5\")\n",
              "        .then(() => maybeLoadScript(\"vega-lite\", \"5.17.0\"))\n",
              "        .then(() => maybeLoadScript(\"vega-embed\", \"6\"))\n",
              "        .catch(showError)\n",
              "        .then(() => displayChart(vegaEmbed));\n",
              "    }\n",
              "  })({\"config\": {\"view\": {\"continuousWidth\": 400, \"continuousHeight\": 300}}, \"layer\": [{\"mark\": {\"type\": \"line\"}, \"encoding\": {\"x\": {\"axis\": {\"format\": \"+\", \"title\": \"Threshold match weight\"}, \"field\": \"match_weight\", \"type\": \"quantitative\"}, \"y\": {\"axis\": {\"format\": \"%\", \"title\": \"Percentage of unlinkable records\"}, \"field\": \"cum_prop\", \"type\": \"quantitative\"}}}, {\"mark\": {\"type\": \"point\"}, \"encoding\": {\"opacity\": {\"condition\": {\"param\": \"x_match_weight_y_cum_prop_coords_of_mouse\", \"value\": 1, \"empty\": false}, \"value\": 0}, \"tooltip\": [{\"field\": \"match_weight\", \"format\": \"+.5\", \"title\": \"Match weight\", \"type\": \"quantitative\"}, {\"field\": \"match_probability\", \"format\": \".5\", \"title\": \"Match probability\", \"type\": \"quantitative\"}, {\"field\": \"cum_prop\", \"format\": \".3%\", \"title\": \"Proportion of unlinkable records\", \"type\": \"quantitative\"}], \"x\": {\"axis\": {\"title\": \"Threshold match weight\"}, \"field\": \"match_weight\", \"type\": \"quantitative\"}, \"y\": {\"axis\": {\"format\": \"%\", \"title\": \"Percentage of unlinkable records\"}, \"field\": \"cum_prop\", \"type\": \"quantitative\"}}, \"name\": \"mouse_coords\"}, {\"mark\": {\"type\": \"rule\", \"color\": \"gray\"}, \"encoding\": {\"x\": {\"field\": \"match_weight\", \"type\": \"quantitative\"}}, \"transform\": [{\"filter\": {\"param\": \"x_match_weight_y_cum_prop_coords_of_mouse\", \"empty\": false}}]}, {\"mark\": {\"type\": \"rule\", \"color\": \"gray\"}, \"encoding\": {\"y\": {\"field\": \"cum_prop\", \"type\": \"quantitative\"}}, \"transform\": [{\"filter\": {\"param\": \"x_match_weight_y_cum_prop_coords_of_mouse\", \"empty\": false}}]}], \"data\": {\"name\": \"data-a3c158a1787f7c95d3342132e8144d9d\"}, \"height\": 400, \"params\": [{\"name\": \"x_match_weight_y_cum_prop_coords_of_mouse\", \"select\": {\"type\": \"point\", \"fields\": [\"match_weight\", 
\"cum_prop\"], \"nearest\": true, \"on\": \"mouseover\"}, \"views\": [\"mouse_coords\"]}], \"title\": {\"text\": \"Unlinkable records\", \"subtitle\": \"Records with insufficient information to exceed a given match threshold\"}, \"width\": 400, \"$schema\": \"https://vega.github.io/schema/vega-lite/v5.9.3.json\", \"datasets\": {\"data-a3c158a1787f7c95d3342132e8144d9d\": [{\"match_weight\": -12.85, \"match_probability\": 0.00014, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 3.9542883314425126e-05}, {\"match_weight\": -9.5, \"match_probability\": 0.00138, \"prop\": 5.931432679062709e-05, \"cum_prop\": 9.885721374303102e-05}, {\"match_weight\": -8.43, \"match_probability\": 0.00288, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00011862865358125418}, {\"match_weight\": -8.39, \"match_probability\": 0.00298, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00013840009341947734}, {\"match_weight\": -7.87, \"match_probability\": 0.00426, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0001581715332577005}, {\"match_weight\": -7.74, \"match_probability\": 0.00467, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00017794297309592366}, {\"match_weight\": -7.44, \"match_probability\": 0.00574, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00019771441293414682}, {\"match_weight\": -7.08, \"match_probability\": 0.00734, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00021748585277236998}, {\"match_weight\": -7.04, \"match_probability\": 0.00753, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00023725730716250837}, {\"match_weight\": -6.94, \"match_probability\": 0.00808, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.0002768001868389547}, {\"match_weight\": -6.61, \"match_probability\": 0.01014, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0002965716412290931}, {\"match_weight\": -6.53, \"match_probability\": 0.01071, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.000316343066515401}, {\"match_weight\": -6.31, \"match_probability\": 0.01248, 
\"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0003361145209055394}, {\"match_weight\": -6.25, \"match_probability\": 0.01299, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0003558859461918473}, {\"match_weight\": -6.2, \"match_probability\": 0.01341, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0003756574005819857}, {\"match_weight\": -5.52, \"match_probability\": 0.02137, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00039542882586829364}, {\"match_weight\": -5.37, \"match_probability\": 0.02365, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00041520028025843203}, {\"match_weight\": -5.3, \"match_probability\": 0.02482, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00043497170554473996}, {\"match_weight\": -5.16, \"match_probability\": 0.02723, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00045474315993487835}, {\"match_weight\": -5.07, \"match_probability\": 0.02889, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00047451461432501674}, {\"match_weight\": -4.99, \"match_probability\": 0.03053, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.000533828919287771}, {\"match_weight\": -4.93, \"match_probability\": 0.03181, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0005536003736779094}, {\"match_weight\": -4.92, \"match_probability\": 0.03208, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0005733718280680478}, {\"match_weight\": -4.73, \"match_probability\": 0.03641, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0005931432824581861}, {\"match_weight\": -4.72, \"match_probability\": 0.03662, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0006129146786406636}, {\"match_weight\": -4.67, \"match_probability\": 0.03787, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.000632686133030802}, {\"match_weight\": -4.66, \"match_probability\": 0.03801, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0006524575874209404}, {\"match_weight\": -4.61, \"match_probability\": 0.03945, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 
0.0006722290418110788}, {\"match_weight\": -4.57, \"match_probability\": 0.04046, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0006920004379935563}, {\"match_weight\": -4.55, \"match_probability\": 0.04087, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.0007710862555541098}, {\"match_weight\": -4.44, \"match_probability\": 0.04405, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0007908576517365873}, {\"match_weight\": -1.08, \"match_probability\": 0.3208, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0016608011210337281}, {\"match_weight\": -0.96, \"match_probability\": 0.33914, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0016805725172162056}, {\"match_weight\": -0.89, \"match_probability\": 0.35056, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.001700344029814005}, {\"match_weight\": -0.88, \"match_probability\": 0.35259, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0017201154259964824}, {\"match_weight\": -0.86, \"match_probability\": 0.35546, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0017398868221789598}, {\"match_weight\": -0.75, \"match_probability\": 0.37255, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0017596583347767591}, {\"match_weight\": -0.71, \"match_probability\": 0.37877, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0017794297309592366}, {\"match_weight\": -0.69, \"match_probability\": 0.38188, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.001799201243557036}, {\"match_weight\": -0.69, \"match_probability\": 0.38267, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.0018387440359219909}, {\"match_weight\": -0.67, \"match_probability\": 0.38641, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0018585155485197902}, {\"match_weight\": -0.62, \"match_probability\": 0.39415, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0018782869447022676}, {\"match_weight\": -0.61, \"match_probability\": 0.39519, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.001898058457300067}, {\"match_weight\": -0.55, 
\"match_probability\": 0.40516, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0019178298534825444}, {\"match_weight\": -0.5, \"match_probability\": 0.41424, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.001937601249665022}, {\"match_weight\": -0.45, \"match_probability\": 0.42299, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.001957372762262821}, {\"match_weight\": -0.41, \"match_probability\": 0.43026, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0019771442748606205}, {\"match_weight\": -0.29, \"match_probability\": 0.45013, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.0020562298595905304}, {\"match_weight\": -0.28, \"match_probability\": 0.45172, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0020760013721883297}, {\"match_weight\": -0.26, \"match_probability\": 0.45584, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.002095772884786129}, {\"match_weight\": -0.14, \"match_probability\": 0.47551, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0021155441645532846}, {\"match_weight\": -0.08, \"match_probability\": 0.48577, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.002135315677151084}, {\"match_weight\": -0.05, \"match_probability\": 0.4919, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0021550871897488832}, {\"match_weight\": -0.04, \"match_probability\": 0.49362, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0021748587023466825}, {\"match_weight\": 0.02, \"match_probability\": 0.50416, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.002194629982113838}, {\"match_weight\": 0.25, \"match_probability\": 0.54248, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0022144014947116375}, {\"match_weight\": 0.33, \"match_probability\": 0.55743, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.0022737157996743917}, {\"match_weight\": 0.35, \"match_probability\": 0.56035, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.002293487312272191}, {\"match_weight\": 0.38, \"match_probability\": 0.56543, \"prop\": 
1.9771441657212563e-05, \"cum_prop\": 0.0023132585920393467}, {\"match_weight\": 0.42, \"match_probability\": 0.57281, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.002333030104637146}, {\"match_weight\": 0.49, \"match_probability\": 0.58343, \"prop\": 0.0002965716412290931, \"cum_prop\": 0.002629601862281561}, {\"match_weight\": 0.51, \"match_probability\": 0.58731, \"prop\": 0.0003558859461918473, \"cum_prop\": 0.0029854876920580864}, {\"match_weight\": 0.58, \"match_probability\": 0.59934, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0030052592046558857}, {\"match_weight\": 0.63, \"match_probability\": 0.60704, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0030250304844230413}, {\"match_weight\": 0.63, \"match_probability\": 0.60722, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0030448019970208406}, {\"match_weight\": 0.66, \"match_probability\": 0.6116, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00306457350961864}, {\"match_weight\": 0.69, \"match_probability\": 0.61731, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0030843450222164392}, {\"match_weight\": 0.74, \"match_probability\": 0.62585, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.003104116301983595}, {\"match_weight\": 0.78, \"match_probability\": 0.63254, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.003123887814581394}, {\"match_weight\": 0.86, \"match_probability\": 0.64444, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0031436593271791935}, {\"match_weight\": 0.92, \"match_probability\": 0.65389, \"prop\": 0.0002768001868389547, \"cum_prop\": 0.0034204593393951654}, {\"match_weight\": 0.92, \"match_probability\": 0.6544, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0034402308519929647}, {\"match_weight\": 0.95, \"match_probability\": 0.65892, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.003460002364590764}, {\"match_weight\": 1.08, \"match_probability\": 0.67829, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0034797736443579197}, 
{\"match_weight\": 1.08, \"match_probability\": 0.67885, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.003499545156955719}, {\"match_weight\": 1.12, \"match_probability\": 0.68503, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0035193166695535183}, {\"match_weight\": 1.15, \"match_probability\": 0.69006, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0035390881821513176}, {\"match_weight\": 1.16, \"match_probability\": 0.69154, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0035588594619184732}, {\"match_weight\": 1.22, \"match_probability\": 0.69959, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.003598402487114072}, {\"match_weight\": 1.25, \"match_probability\": 0.70416, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0036181737668812275}, {\"match_weight\": 1.26, \"match_probability\": 0.70509, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.003637945279479027}, {\"match_weight\": 1.29, \"match_probability\": 0.70932, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.003657716792076826}, {\"match_weight\": 1.35, \"match_probability\": 0.71824, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.003697259584441781}, {\"match_weight\": 1.43, \"match_probability\": 0.72892, \"prop\": 0.00019771442748606205, \"cum_prop\": 0.003894974011927843}, {\"match_weight\": 1.46, \"match_probability\": 0.73345, \"prop\": 0.00025702876155264676, \"cum_prop\": 0.004152002744376659}, {\"match_weight\": 1.52, \"match_probability\": 0.74094, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.004171774256974459}, {\"match_weight\": 1.57, \"match_probability\": 0.74824, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.004191545769572258}, {\"match_weight\": 1.58, \"match_probability\": 0.74883, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.004270631354302168}, {\"match_weight\": 1.62, \"match_probability\": 0.75478, \"prop\": 0.0001581715332577005, \"cum_prop\": 0.004428802989423275}, {\"match_weight\": 1.63, \"match_probability\": 0.75522, \"prop\": 
0.00023725730716250837, \"cum_prop\": 0.004666060209274292}, {\"match_weight\": 1.66, \"match_probability\": 0.76014, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.004705603234469891}, {\"match_weight\": 1.68, \"match_probability\": 0.76202, \"prop\": 0.0002174858673242852, \"cum_prop\": 0.004923088941723108}, {\"match_weight\": 1.81, \"match_probability\": 0.7783, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.004942860454320908}, {\"match_weight\": 1.87, \"match_probability\": 0.78544, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.004962631966918707}, {\"match_weight\": 1.92, \"match_probability\": 0.79073, \"prop\": 0.0006722290418110788, \"cum_prop\": 0.005634861066937447}, {\"match_weight\": -4.32, \"match_probability\": 0.04778, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0008106291061267257}, {\"match_weight\": -4.25, \"match_probability\": 0.04986, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.0008501720149070024}, {\"match_weight\": -4.16, \"match_probability\": 0.05312, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0008699434110894799}, {\"match_weight\": -3.92, \"match_probability\": 0.06205, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0008897148654796183}, {\"match_weight\": -3.81, \"match_probability\": 0.06655, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0009094863198697567}, {\"match_weight\": -3.8, \"match_probability\": 0.06689, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0009292577742598951}, {\"match_weight\": -3.7, \"match_probability\": 0.07167, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0009490292286500335}, {\"match_weight\": -3.67, \"match_probability\": 0.07298, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.000968800624832511}, {\"match_weight\": -3.4, \"match_probability\": 0.08631, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.0010083435336127877}, {\"match_weight\": -3.33, \"match_probability\": 0.09067, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0010281149297952652}, 
{\"match_weight\": -3.22, \"match_probability\": 0.09705, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0010478864423930645}, {\"match_weight\": -3.02, \"match_probability\": 0.10995, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.001067657838575542}, {\"match_weight\": -2.96, \"match_probability\": 0.11356, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0010874293511733413}, {\"match_weight\": -2.94, \"match_probability\": 0.11528, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.0011269721435382962}, {\"match_weight\": -2.73, \"match_probability\": 0.1313, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0011467436561360955}, {\"match_weight\": -2.63, \"match_probability\": 0.13932, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.001166515052318573}, {\"match_weight\": -2.4, \"match_probability\": 0.1592, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0011862864485010505}, {\"match_weight\": -2.33, \"match_probability\": 0.16623, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0012060579610988498}, {\"match_weight\": -1.99, \"match_probability\": 0.20088, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0012258293572813272}, {\"match_weight\": -1.99, \"match_probability\": 0.20122, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.0012851436622440815}, {\"match_weight\": -1.83, \"match_probability\": 0.21948, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0013049151748418808}, {\"match_weight\": -1.78, \"match_probability\": 0.2252, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.0013444580836221576}, {\"match_weight\": -1.76, \"match_probability\": 0.22801, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.0013840008759871125}, {\"match_weight\": -1.67, \"match_probability\": 0.23948, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0014037723885849118}, {\"match_weight\": -1.53, \"match_probability\": 0.25787, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0014235437847673893}, {\"match_weight\": -1.48, \"match_probability\": 
0.26386, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0014433152973651886}, {\"match_weight\": -1.28, \"match_probability\": 0.29159, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.001463086693547666}, {\"match_weight\": -1.26, \"match_probability\": 0.29487, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0014828580897301435}, {\"match_weight\": -1.25, \"match_probability\": 0.2957, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0015026296023279428}, {\"match_weight\": -1.21, \"match_probability\": 0.30228, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.0016014868160709739}, {\"match_weight\": -1.17, \"match_probability\": 0.30816, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0016212582122534513}, {\"match_weight\": -1.14, \"match_probability\": 0.31172, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0016410297248512506}, {\"match_weight\": 2.02, \"match_probability\": 0.80254, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.005654632579535246}, {\"match_weight\": 2.09, \"match_probability\": 0.81002, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.005674403626471758}, {\"match_weight\": 2.1, \"match_probability\": 0.81039, \"prop\": 0.00013840009341947734, \"cum_prop\": 0.005812803748995066}, {\"match_weight\": 2.12, \"match_probability\": 0.81328, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.005832575261592865}, {\"match_weight\": 2.14, \"match_probability\": 0.81497, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.0059314328245818615}, {\"match_weight\": 2.17, \"match_probability\": 0.81825, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0059512038715183735}, {\"match_weight\": 2.2, \"match_probability\": 0.82166, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.005970975384116173}, {\"match_weight\": 2.22, \"match_probability\": 0.82353, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.005990746896713972}, {\"match_weight\": 2.23, \"match_probability\": 0.82396, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 
0.006010518409311771}, {\"match_weight\": 2.27, \"match_probability\": 0.8285, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.00612914701923728}, {\"match_weight\": 2.3, \"match_probability\": 0.83073, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.006148918531835079}, {\"match_weight\": 2.3, \"match_probability\": 0.83118, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0061686900444328785}, {\"match_weight\": 2.35, \"match_probability\": 0.83572, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.006188461557030678}, {\"match_weight\": 2.35, \"match_probability\": 0.83611, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00620823260396719}, {\"match_weight\": 2.56, \"match_probability\": 0.85525, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.006228004116564989}, {\"match_weight\": 2.57, \"match_probability\": 0.85619, \"prop\": 0.00025702876155264676, \"cum_prop\": 0.006485032849013805}, {\"match_weight\": 2.59, \"match_probability\": 0.85777, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.006504804361611605}, {\"match_weight\": 2.63, \"match_probability\": 0.86115, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.006564118899405003}, {\"match_weight\": 2.64, \"match_probability\": 0.86149, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.0066432044841349125}, {\"match_weight\": 2.64, \"match_probability\": 0.86166, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.006662975996732712}, {\"match_weight\": 2.66, \"match_probability\": 0.86305, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.006682747509330511}, {\"match_weight\": 2.66, \"match_probability\": 0.86376, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00670251902192831}, {\"match_weight\": 2.67, \"match_probability\": 0.8639, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.006722290068864822}, {\"match_weight\": 2.68, \"match_probability\": 0.8649, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.006742061581462622}, {\"match_weight\": 2.69, \"match_probability\": 0.86615, 
\"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.006761833094060421}, {\"match_weight\": 2.71, \"match_probability\": 0.86753, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00678160460665822}, {\"match_weight\": 2.76, \"match_probability\": 0.87136, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00680137611925602}, {\"match_weight\": 2.84, \"match_probability\": 0.87742, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.006821147631853819}, {\"match_weight\": 2.87, \"match_probability\": 0.87935, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.006840919144451618}, {\"match_weight\": 2.91, \"match_probability\": 0.88238, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00686069019138813}, {\"match_weight\": 2.91, \"match_probability\": 0.88274, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0068804617039859295}, {\"match_weight\": 2.93, \"match_probability\": 0.88379, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.006959547754377127}, {\"match_weight\": 2.93, \"match_probability\": 0.88399, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.007038633339107037}, {\"match_weight\": 2.93, \"match_probability\": 0.88436, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.007058404851704836}, {\"match_weight\": 2.94, \"match_probability\": 0.88439, \"prop\": 0.00017794297309592366, \"cum_prop\": 0.007236347999423742}, {\"match_weight\": 2.97, \"match_probability\": 0.8866, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.007295662071555853}, {\"match_weight\": 2.98, \"match_probability\": 0.88741, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.007315433584153652}, {\"match_weight\": 3.04, \"match_probability\": 0.89167, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0073352050967514515}, {\"match_weight\": 3.07, \"match_probability\": 0.89332, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.00745383370667696}, {\"match_weight\": 3.07, \"match_probability\": 0.89352, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.007493376731872559}, 
{\"match_weight\": 3.09, \"match_probability\": 0.89504, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.007513147778809071}, {\"match_weight\": 3.1, \"match_probability\": 0.8957, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00753291929140687}, {\"match_weight\": 3.13, \"match_probability\": 0.89775, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.007552690804004669}, {\"match_weight\": 3.14, \"match_probability\": 0.89827, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0075724623166024685}, {\"match_weight\": 3.17, \"match_probability\": 0.8998, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.007592233829200268}, {\"match_weight\": 3.19, \"match_probability\": 0.90138, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.007612005341798067}, {\"match_weight\": 3.21, \"match_probability\": 0.90253, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.007631776388734579}, {\"match_weight\": 3.21, \"match_probability\": 0.90272, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.007651547901332378}, {\"match_weight\": 3.22, \"match_probability\": 0.90284, \"prop\": 0.0001581715332577005, \"cum_prop\": 0.0078097195364534855}, {\"match_weight\": 3.27, \"match_probability\": 0.90599, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.007829491049051285}, {\"match_weight\": 3.28, \"match_probability\": 0.90639, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.007849262095987797}, {\"match_weight\": 3.3, \"match_probability\": 0.90793, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.007869034074246883}, {\"match_weight\": 3.31, \"match_probability\": 0.90839, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.007888805121183395}, {\"match_weight\": 3.33, \"match_probability\": 0.90946, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.007908577099442482}, {\"match_weight\": 3.34, \"match_probability\": 0.91034, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.007928348146378994}, {\"match_weight\": 3.38, \"match_probability\": 0.91231, \"prop\": 
1.9771441657212563e-05, \"cum_prop\": 0.00794812012463808}, {\"match_weight\": 3.4, \"match_probability\": 0.91343, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.00804697722196579}, {\"match_weight\": 3.4, \"match_probability\": 0.91362, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.008066748268902302}, {\"match_weight\": 3.44, \"match_probability\": 0.91561, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.008086519315838814}, {\"match_weight\": 3.48, \"match_probability\": 0.91774, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.008126062341034412}, {\"match_weight\": 3.54, \"match_probability\": 0.92077, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.008145834319293499}, {\"match_weight\": 3.55, \"match_probability\": 0.92148, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.008165605366230011}, {\"match_weight\": 3.57, \"match_probability\": 0.92253, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.008185377344489098}, {\"match_weight\": 3.58, \"match_probability\": 0.92266, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00820514839142561}, {\"match_weight\": 3.59, \"match_probability\": 0.92318, \"prop\": 0.00025702876155264676, \"cum_prop\": 0.008462177589535713}, {\"match_weight\": 3.59, \"match_probability\": 0.92321, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.008481948636472225}, {\"match_weight\": 3.6, \"match_probability\": 0.92406, \"prop\": 0.00017794297309592366, \"cum_prop\": 0.008659891784191132}, {\"match_weight\": 3.61, \"match_probability\": 0.92416, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.008679662831127644}, {\"match_weight\": 3.63, \"match_probability\": 0.92508, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00869943480938673}, {\"match_weight\": 3.64, \"match_probability\": 0.92581, \"prop\": 0.00013840009341947734, \"cum_prop\": 0.008837834931910038}, {\"match_weight\": 3.65, \"match_probability\": 0.92621, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00885760597884655}, {\"match_weight\": 3.7, 
\"match_probability\": 0.9287, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.008877377025783062}, {\"match_weight\": 3.74, \"match_probability\": 0.93032, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.008897149004042149}, {\"match_weight\": 3.74, \"match_probability\": 0.93057, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00891692005097866}, {\"match_weight\": 3.76, \"match_probability\": 0.93134, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.008936692029237747}, {\"match_weight\": 3.76, \"match_probability\": 0.93135, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.008976235054433346}, {\"match_weight\": 3.77, \"match_probability\": 0.93171, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.008996006101369858}, {\"match_weight\": 3.77, \"match_probability\": 0.9318, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00901577714830637}, {\"match_weight\": 3.81, \"match_probability\": 0.93336, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.009114635176956654}, {\"match_weight\": 3.81, \"match_probability\": 0.93352, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.009134406223893166}, {\"match_weight\": 3.82, \"match_probability\": 0.93405, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.009154177270829678}, {\"match_weight\": 3.83, \"match_probability\": 0.93412, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.009193720296025276}, {\"match_weight\": 3.84, \"match_probability\": 0.93487, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.009213492274284363}, {\"match_weight\": 3.85, \"match_probability\": 0.93503, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.009233263321220875}, {\"match_weight\": 3.85, \"match_probability\": 0.93515, \"prop\": 0.00013840009341947734, \"cum_prop\": 0.009371663443744183}, {\"match_weight\": 3.89, \"match_probability\": 0.93691, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.00939143542200327}, {\"match_weight\": 3.9, \"match_probability\": 0.93713, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 
0.009411206468939781}, {\"match_weight\": 3.91, \"match_probability\": 0.93761, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.009430977515876293}, {\"match_weight\": 3.94, \"match_probability\": 0.93876, \"prop\": 0.0001581715332577005, \"cum_prop\": 0.009589149616658688}, {\"match_weight\": 3.96, \"match_probability\": 0.93975, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0096089206635952}, {\"match_weight\": 4.03, \"match_probability\": 0.94217, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.00966823473572731}, {\"match_weight\": 4.06, \"match_probability\": 0.94349, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.009707777760922909}, {\"match_weight\": 4.06, \"match_probability\": 0.94354, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.009727549739181995}, {\"match_weight\": 4.08, \"match_probability\": 0.94424, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.009747320786118507}, {\"match_weight\": 4.1, \"match_probability\": 0.94478, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.009767092764377594}, {\"match_weight\": 4.17, \"match_probability\": 0.94739, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.009786863811314106}, {\"match_weight\": 4.18, \"match_probability\": 0.94785, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.009806634858250618}, {\"match_weight\": 4.27, \"match_probability\": 0.95068, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.009846177883446217}, {\"match_weight\": 4.27, \"match_probability\": 0.95084, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.009865949861705303}, {\"match_weight\": 4.33, \"match_probability\": 0.95259, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.009905492886900902}, {\"match_weight\": 4.35, \"match_probability\": 0.95338, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.009945034980773926}, {\"match_weight\": 4.42, \"match_probability\": 0.95534, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.009984578005969524}, {\"match_weight\": 4.43, \"match_probability\": 0.95558, 
\"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.010024121031165123}, {\"match_weight\": 4.47, \"match_probability\": 0.95673, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.010043892078101635}, {\"match_weight\": 4.48, \"match_probability\": 0.95698, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.010063664056360722}, {\"match_weight\": 4.48, \"match_probability\": 0.95699, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.010142750106751919}, {\"match_weight\": 4.48, \"match_probability\": 0.95705, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.010182292200624943}, {\"match_weight\": 4.49, \"match_probability\": 0.95746, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.010300921276211739}, {\"match_weight\": 4.5, \"match_probability\": 0.9577, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.010380007326602936}, {\"match_weight\": 4.51, \"match_probability\": 0.95797, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.010399778373539448}, {\"match_weight\": 4.53, \"match_probability\": 0.9584, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.010419550351798534}, {\"match_weight\": 4.53, \"match_probability\": 0.95864, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.010478864423930645}, {\"match_weight\": 4.55, \"match_probability\": 0.95911, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.010538178496062756}, {\"match_weight\": 4.55, \"match_probability\": 0.95918, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.010557950474321842}, {\"match_weight\": 4.58, \"match_probability\": 0.95995, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.010577721521258354}, {\"match_weight\": 4.59, \"match_probability\": 0.96005, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.010597492568194866}, {\"match_weight\": 4.6, \"match_probability\": 0.96047, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.010617264546453953}, {\"match_weight\": 4.64, \"match_probability\": 0.96148, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.010716121643781662}, 
{\"match_weight\": 4.64, \"match_probability\": 0.96152, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.010735892690718174}, {\"match_weight\": 4.65, \"match_probability\": 0.96172, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.01085452176630497}, {\"match_weight\": 4.66, \"match_probability\": 0.96195, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.01091383583843708}, {\"match_weight\": 4.67, \"match_probability\": 0.96228, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.010933607816696167}, {\"match_weight\": 4.69, \"match_probability\": 0.96264, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.010953378863632679}, {\"match_weight\": 4.69, \"match_probability\": 0.96279, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.010973149910569191}, {\"match_weight\": 4.7, \"match_probability\": 0.9629, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.010992921888828278}, {\"match_weight\": 4.72, \"match_probability\": 0.96346, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01101269293576479}, {\"match_weight\": 4.75, \"match_probability\": 0.9641, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.011052235960960388}, {\"match_weight\": 4.78, \"match_probability\": 0.96497, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.011072007939219475}, {\"match_weight\": 4.79, \"match_probability\": 0.9651, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.011091778986155987}, {\"match_weight\": 4.8, \"match_probability\": 0.9653, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.011111550033092499}, {\"match_weight\": 4.8, \"match_probability\": 0.96531, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.011131322011351585}, {\"match_weight\": 4.81, \"match_probability\": 0.96553, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.011210408061742783}, {\"match_weight\": 4.81, \"match_probability\": 0.96557, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.011230179108679295}, {\"match_weight\": 4.81, \"match_probability\": 0.96562, \"prop\": 1.9771441657212563e-05, 
\"cum_prop\": 0.011249950155615807}, {\"match_weight\": 4.85, \"match_probability\": 0.96649, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.011289493180811405}, {\"match_weight\": 4.88, \"match_probability\": 0.96711, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.011309265159070492}, {\"match_weight\": 4.88, \"match_probability\": 0.96721, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.011427893303334713}, {\"match_weight\": 4.9, \"match_probability\": 0.96769, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.011467436328530312}, {\"match_weight\": 4.91, \"match_probability\": 0.9678, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.011487207375466824}, {\"match_weight\": 4.93, \"match_probability\": 0.96828, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01150697935372591}, {\"match_weight\": 4.94, \"match_probability\": 0.96841, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.01156629342585802}, {\"match_weight\": 4.96, \"match_probability\": 0.96894, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.011645379476249218}, {\"match_weight\": 4.99, \"match_probability\": 0.96948, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.011704693548381329}, {\"match_weight\": 5.0, \"match_probability\": 0.96962, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.011724465526640415}, {\"match_weight\": 5.0, \"match_probability\": 0.96966, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.011744236573576927}, {\"match_weight\": 5.03, \"match_probability\": 0.97036, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01176400762051344}, {\"match_weight\": 5.04, \"match_probability\": 0.97059, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.011823322623968124}, {\"match_weight\": 5.08, \"match_probability\": 0.97122, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.011843093670904636}, {\"match_weight\": 5.08, \"match_probability\": 0.97131, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.011862865649163723}, {\"match_weight\": 5.08, \"match_probability\": 
0.97133, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.011902407743036747}, {\"match_weight\": 5.11, \"match_probability\": 0.97189, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.011922179721295834}, {\"match_weight\": 5.11, \"match_probability\": 0.97193, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.011961722746491432}, {\"match_weight\": 5.17, \"match_probability\": 0.97299, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.012040807865560055}, {\"match_weight\": 5.24, \"match_probability\": 0.97423, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.012080350890755653}, {\"match_weight\": 5.25, \"match_probability\": 0.97447, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.012139664962887764}, {\"match_weight\": 5.27, \"match_probability\": 0.97472, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01215943694114685}, {\"match_weight\": 5.28, \"match_probability\": 0.97496, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.012179207988083363}, {\"match_weight\": 5.33, \"match_probability\": 0.97569, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.012238522991538048}, {\"match_weight\": 5.33, \"match_probability\": 0.97582, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.012278065085411072}, {\"match_weight\": 5.36, \"match_probability\": 0.97618, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.012337380088865757}, {\"match_weight\": 5.36, \"match_probability\": 0.97628, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.012357151135802269}, {\"match_weight\": 5.39, \"match_probability\": 0.97673, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.012376923114061356}, {\"match_weight\": 5.39, \"match_probability\": 0.97674, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.012396694160997868}, {\"match_weight\": 5.4, \"match_probability\": 0.97692, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.012515323236584663}, {\"match_weight\": 5.42, \"match_probability\": 0.97716, \"prop\": 0.0001581715332577005, \"cum_prop\": 0.012673494406044483}, 
{\"match_weight\": 5.43, \"match_probability\": 0.97731, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.012693265452980995}, {\"match_weight\": 5.44, \"match_probability\": 0.97741, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.012713037431240082}, {\"match_weight\": 5.45, \"match_probability\": 0.97767, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.01275258045643568}, {\"match_weight\": 5.47, \"match_probability\": 0.9779, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.01285143755376339}, {\"match_weight\": 5.47, \"match_probability\": 0.97793, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.012871208600699902}, {\"match_weight\": 5.48, \"match_probability\": 0.97806, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.012890980578958988}, {\"match_weight\": 5.48, \"match_probability\": 0.97814, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.012950294651091099}, {\"match_weight\": 5.51, \"match_probability\": 0.97854, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.012989837676286697}, {\"match_weight\": 5.52, \"match_probability\": 0.97863, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01300960872322321}, {\"match_weight\": 5.52, \"match_probability\": 0.97864, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.01306892279535532}, {\"match_weight\": 5.52, \"match_probability\": 0.97868, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.013088694773614407}, {\"match_weight\": 5.53, \"match_probability\": 0.97883, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.013108465820550919}, {\"match_weight\": 5.54, \"match_probability\": 0.97901, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.013128237798810005}, {\"match_weight\": 5.57, \"match_probability\": 0.97935, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.013187551870942116}, {\"match_weight\": 5.57, \"match_probability\": 0.97937, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.013246865943074226}, {\"match_weight\": 5.59, \"match_probability\": 0.97962, \"prop\": 
1.9771441657212563e-05, \"cum_prop\": 0.013266637921333313}, {\"match_weight\": 5.6, \"match_probability\": 0.97974, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.013286408968269825}, {\"match_weight\": 5.6, \"match_probability\": 0.97987, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.013345723040401936}, {\"match_weight\": 5.62, \"match_probability\": 0.98003, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.013444580137729645}, {\"match_weight\": 5.63, \"match_probability\": 0.98015, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.013464352115988731}, {\"match_weight\": 5.63, \"match_probability\": 0.98016, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.013484123162925243}, {\"match_weight\": 5.65, \"match_probability\": 0.98052, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01350389514118433}, {\"match_weight\": 5.67, \"match_probability\": 0.98075, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.013523666188120842}, {\"match_weight\": 5.7, \"match_probability\": 0.9811, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.013543438166379929}, {\"match_weight\": 5.7, \"match_probability\": 0.98111, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01356320921331644}, {\"match_weight\": 5.72, \"match_probability\": 0.98135, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.01360275223851204}, {\"match_weight\": 5.74, \"match_probability\": 0.9816, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.01372138038277626}, {\"match_weight\": 5.75, \"match_probability\": 0.98178, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.013760923407971859}, {\"match_weight\": 5.76, \"match_probability\": 0.98184, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.013840009458363056}, {\"match_weight\": 5.78, \"match_probability\": 0.98209, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.013859780505299568}, {\"match_weight\": 5.8, \"match_probability\": 0.98234, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.013879552483558655}, {\"match_weight\": 5.82, 
\"match_probability\": 0.98265, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.013899323530495167}, {\"match_weight\": 5.84, \"match_probability\": 0.98284, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.013919095508754253}, {\"match_weight\": 5.86, \"match_probability\": 0.9831, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.013938866555690765}, {\"match_weight\": 5.86, \"match_probability\": 0.98311, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.013958637602627277}, {\"match_weight\": 5.88, \"match_probability\": 0.98326, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.013978409580886364}, {\"match_weight\": 5.9, \"match_probability\": 0.98351, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.013998180627822876}, {\"match_weight\": 5.9, \"match_probability\": 0.98358, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.014017952606081963}, {\"match_weight\": 5.91, \"match_probability\": 0.98364, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.014057495631277561}, {\"match_weight\": 5.93, \"match_probability\": 0.98383, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.014077266678214073}, {\"match_weight\": 5.94, \"match_probability\": 0.98395, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.014097037725150585}, {\"match_weight\": 5.95, \"match_probability\": 0.98408, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.014136580750346184}, {\"match_weight\": 5.95, \"match_probability\": 0.98411, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.014176123775541782}, {\"match_weight\": 5.95, \"match_probability\": 0.98413, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.014215666800737381}, {\"match_weight\": 5.96, \"match_probability\": 0.98421, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.014235437847673893}, {\"match_weight\": 5.96, \"match_probability\": 0.98422, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.014274980872869492}, {\"match_weight\": 5.97, \"match_probability\": 0.98433, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 
0.014294752851128578}, {\"match_weight\": 6.0, \"match_probability\": 0.98458, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.014334295876324177}, {\"match_weight\": 6.04, \"match_probability\": 0.98507, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.014452924020588398}, {\"match_weight\": 6.07, \"match_probability\": 0.98532, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01447269506752491}, {\"match_weight\": 6.08, \"match_probability\": 0.98547, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.014492467045783997}, {\"match_weight\": 6.08, \"match_probability\": 0.98548, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.014532010070979595}, {\"match_weight\": 6.09, \"match_probability\": 0.98554, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.014591324143111706}, {\"match_weight\": 6.11, \"match_probability\": 0.98577, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.014630867168307304}, {\"match_weight\": 6.12, \"match_probability\": 0.9858, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.014650638215243816}, {\"match_weight\": 6.12, \"match_probability\": 0.98586, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.014690181240439415}, {\"match_weight\": 6.13, \"match_probability\": 0.98593, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.014729724265635014}, {\"match_weight\": 6.15, \"match_probability\": 0.98615, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.014749495312571526}, {\"match_weight\": 6.18, \"match_probability\": 0.98642, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.014769267290830612}, {\"match_weight\": 6.18, \"match_probability\": 0.98643, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.014789038337767124}, {\"match_weight\": 6.2, \"match_probability\": 0.98657, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.01484835334122181}, {\"match_weight\": 6.21, \"match_probability\": 0.98668, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.014868124388158321}, {\"match_weight\": 6.22, \"match_probability\": 0.98676, 
\"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.01490766741335392}, {\"match_weight\": 6.23, \"match_probability\": 0.98684, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.014927438460290432}, {\"match_weight\": 6.25, \"match_probability\": 0.98707, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.014947210438549519}, {\"match_weight\": 6.26, \"match_probability\": 0.98716, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.01500652451068163}, {\"match_weight\": 6.27, \"match_probability\": 0.98724, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.015046067535877228}, {\"match_weight\": 6.29, \"match_probability\": 0.98734, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01506583858281374}, {\"match_weight\": 6.3, \"match_probability\": 0.9875, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.01512515265494585}, {\"match_weight\": 6.32, \"match_probability\": 0.98765, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015144924633204937}, {\"match_weight\": 6.37, \"match_probability\": 0.98807, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015164695680141449}, {\"match_weight\": 6.4, \"match_probability\": 0.98832, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015184467658400536}, {\"match_weight\": 6.4, \"match_probability\": 0.98833, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015204238705337048}, {\"match_weight\": 6.44, \"match_probability\": 0.98857, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.015243781730532646}, {\"match_weight\": 6.45, \"match_probability\": 0.98869, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015263552777469158}, {\"match_weight\": 6.46, \"match_probability\": 0.9888, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015283324755728245}, {\"match_weight\": 6.47, \"match_probability\": 0.98883, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.015401952899992466}, {\"match_weight\": 6.5, \"match_probability\": 0.98904, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015421724878251553}, {\"match_weight\": 
6.51, \"match_probability\": 0.98914, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015441495925188065}, {\"match_weight\": 6.51, \"match_probability\": 0.98915, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015461267903447151}, {\"match_weight\": 6.51, \"match_probability\": 0.98916, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015481038950383663}, {\"match_weight\": 6.51, \"match_probability\": 0.98918, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01550081092864275}, {\"match_weight\": 6.52, \"match_probability\": 0.98923, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015520581975579262}, {\"match_weight\": 6.53, \"match_probability\": 0.98926, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015540353022515774}, {\"match_weight\": 6.53, \"match_probability\": 0.98931, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.015579896047711372}, {\"match_weight\": 6.54, \"match_probability\": 0.98933, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.015619439072906971}, {\"match_weight\": 6.55, \"match_probability\": 0.98941, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015639210119843483}, {\"match_weight\": 6.57, \"match_probability\": 0.98955, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01565898209810257}, {\"match_weight\": 6.58, \"match_probability\": 0.98968, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015678754076361656}, {\"match_weight\": 6.6, \"match_probability\": 0.98977, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015698524191975594}, {\"match_weight\": 6.6, \"match_probability\": 0.98983, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.01577761024236679}, {\"match_weight\": 6.62, \"match_probability\": 0.98995, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015797382220625877}, {\"match_weight\": 6.62, \"match_probability\": 0.98997, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015817154198884964}, {\"match_weight\": 6.63, \"match_probability\": 0.98998, \"prop\": 5.931432679062709e-05, \"cum_prop\": 
0.015876468271017075}, {\"match_weight\": 6.64, \"match_probability\": 0.99007, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.0159160103648901}, {\"match_weight\": 6.64, \"match_probability\": 0.99008, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015935782343149185}, {\"match_weight\": 6.65, \"match_probability\": 0.99015, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015955554321408272}, {\"match_weight\": 6.66, \"match_probability\": 0.99021, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01597532443702221}, {\"match_weight\": 6.68, \"match_probability\": 0.99033, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.015995096415281296}, {\"match_weight\": 6.71, \"match_probability\": 0.99054, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.016014868393540382}, {\"match_weight\": 6.71, \"match_probability\": 0.99057, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01603463850915432}, {\"match_weight\": 6.72, \"match_probability\": 0.99059, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.01609395444393158}, {\"match_weight\": 6.72, \"match_probability\": 0.99063, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.016133496537804604}, {\"match_weight\": 6.75, \"match_probability\": 0.99081, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.016173038631677628}, {\"match_weight\": 6.77, \"match_probability\": 0.99091, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.016192810609936714}, {\"match_weight\": 6.78, \"match_probability\": 0.99096, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0162125825881958}, {\"match_weight\": 6.8, \"match_probability\": 0.99109, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.01627189666032791}, {\"match_weight\": 6.81, \"match_probability\": 0.99114, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.016291668638586998}, {\"match_weight\": 6.81, \"match_probability\": 0.99115, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.016311438754200935}, {\"match_weight\": 6.81, \"match_probability\": 0.9912, \"prop\": 
1.9771441657212563e-05, \"cum_prop\": 0.016331210732460022}, {\"match_weight\": 6.83, \"match_probability\": 0.99131, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01635098271071911}, {\"match_weight\": 6.84, \"match_probability\": 0.99134, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.01641029678285122}, {\"match_weight\": 6.85, \"match_probability\": 0.99143, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.016430068761110306}, {\"match_weight\": 6.88, \"match_probability\": 0.99157, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.01646961085498333}, {\"match_weight\": 6.88, \"match_probability\": 0.99159, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.016509154811501503}, {\"match_weight\": 6.88, \"match_probability\": 0.9916, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.01658823899924755}, {\"match_weight\": 6.89, \"match_probability\": 0.99163, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.016608010977506638}, {\"match_weight\": 6.89, \"match_probability\": 0.99166, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.016627782955765724}, {\"match_weight\": 6.9, \"match_probability\": 0.9917, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01664755493402481}, {\"match_weight\": 6.92, \"match_probability\": 0.9918, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.016667325049638748}, {\"match_weight\": 6.93, \"match_probability\": 0.99185, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.016746411100029945}, {\"match_weight\": 6.93, \"match_probability\": 0.99188, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.016766183078289032}, {\"match_weight\": 6.94, \"match_probability\": 0.99193, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01678595505654812}, {\"match_weight\": 6.95, \"match_probability\": 0.99199, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.016805725172162056}, {\"match_weight\": 6.96, \"match_probability\": 0.99202, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.016825497150421143}, {\"match_weight\": 6.96, 
\"match_probability\": 0.99203, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01684526912868023}, {\"match_weight\": 6.97, \"match_probability\": 0.9921, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.016865039244294167}, {\"match_weight\": 6.97, \"match_probability\": 0.99211, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.016884811222553253}, {\"match_weight\": 7.01, \"match_probability\": 0.99229, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01690458320081234}, {\"match_weight\": 7.01, \"match_probability\": 0.99232, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.016924355179071426}, {\"match_weight\": 7.02, \"match_probability\": 0.99233, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.016944125294685364}, {\"match_weight\": 7.02, \"match_probability\": 0.99235, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.016983669251203537}, {\"match_weight\": 7.03, \"match_probability\": 0.9924, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.01702321134507656}, {\"match_weight\": 7.03, \"match_probability\": 0.99242, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.01708252541720867}, {\"match_weight\": 7.04, \"match_probability\": 0.99244, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.017102297395467758}, {\"match_weight\": 7.04, \"match_probability\": 0.99246, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.017122069373726845}, {\"match_weight\": 7.05, \"match_probability\": 0.99249, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.01716161146759987}, {\"match_weight\": 7.06, \"match_probability\": 0.99255, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.017201155424118042}, {\"match_weight\": 7.06, \"match_probability\": 0.99256, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01722092553973198}, {\"match_weight\": 7.07, \"match_probability\": 0.99259, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.017240697517991066}, {\"match_weight\": 7.07, \"match_probability\": 0.9926, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 
0.017260469496250153}, {\"match_weight\": 7.07, \"match_probability\": 0.99261, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.017319783568382263}, {\"match_weight\": 7.07, \"match_probability\": 0.99262, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.017359325662255287}, {\"match_weight\": 7.08, \"match_probability\": 0.99265, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.017379097640514374}, {\"match_weight\": 7.08, \"match_probability\": 0.99267, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01739886961877346}, {\"match_weight\": 7.08, \"match_probability\": 0.99268, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.017418639734387398}, {\"match_weight\": 7.09, \"match_probability\": 0.99273, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.017438411712646484}, {\"match_weight\": 7.11, \"match_probability\": 0.9928, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01745818369090557}, {\"match_weight\": 7.12, \"match_probability\": 0.99286, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01747795380651951}, {\"match_weight\": 7.12, \"match_probability\": 0.99288, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.017497725784778595}, {\"match_weight\": 7.13, \"match_probability\": 0.99292, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01751749776303768}, {\"match_weight\": 7.16, \"match_probability\": 0.99307, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.017537269741296768}, {\"match_weight\": 7.17, \"match_probability\": 0.99311, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.017576811835169792}, {\"match_weight\": 7.19, \"match_probability\": 0.99319, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01759658381342888}, {\"match_weight\": 7.2, \"match_probability\": 0.99324, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.017616353929042816}, {\"match_weight\": 7.2, \"match_probability\": 0.99325, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.017636125907301903}, {\"match_weight\": 7.21, \"match_probability\": 0.9933, \"prop\": 
1.9771441657212563e-05, \"cum_prop\": 0.01765589788556099}, {\"match_weight\": 7.23, \"match_probability\": 0.99337, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.017734983935952187}, {\"match_weight\": 7.23, \"match_probability\": 0.99338, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.017814069986343384}, {\"match_weight\": 7.24, \"match_probability\": 0.99342, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01783384010195732}, {\"match_weight\": 7.24, \"match_probability\": 0.99344, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.017853612080216408}, {\"match_weight\": 7.25, \"match_probability\": 0.99349, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.017873384058475494}, {\"match_weight\": 7.28, \"match_probability\": 0.99359, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.01791292615234852}, {\"match_weight\": 7.28, \"match_probability\": 0.99362, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.01795247010886669}, {\"match_weight\": 7.31, \"match_probability\": 0.99373, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01797224022448063}, {\"match_weight\": 7.33, \"match_probability\": 0.99384, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.017992012202739716}, {\"match_weight\": 7.34, \"match_probability\": 0.99385, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.018011784180998802}, {\"match_weight\": 7.34, \"match_probability\": 0.99387, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.01809087023139}, {\"match_weight\": 7.36, \"match_probability\": 0.99397, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.018110640347003937}, {\"match_weight\": 7.38, \"match_probability\": 0.99402, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.018130412325263023}, {\"match_weight\": 7.4, \"match_probability\": 0.99413, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.018189726397395134}, {\"match_weight\": 7.44, \"match_probability\": 0.99429, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01820949837565422}, {\"match_weight\": 7.45, 
\"match_probability\": 0.9943, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.018229270353913307}, {\"match_weight\": 7.46, \"match_probability\": 0.99436, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.018249040469527245}, {\"match_weight\": 7.46, \"match_probability\": 0.99437, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01826881244778633}, {\"match_weight\": 7.47, \"match_probability\": 0.99438, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.018308354541659355}, {\"match_weight\": 7.47, \"match_probability\": 0.99441, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.018328126519918442}, {\"match_weight\": 7.48, \"match_probability\": 0.99444, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.018367670476436615}, {\"match_weight\": 7.51, \"match_probability\": 0.99453, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.01840721257030964}, {\"match_weight\": 7.52, \"match_probability\": 0.99459, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.018426984548568726}, {\"match_weight\": 7.54, \"match_probability\": 0.99467, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.018446754664182663}, {\"match_weight\": 7.55, \"match_probability\": 0.99468, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01846652664244175}, {\"match_weight\": 7.55, \"match_probability\": 0.99471, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.018486298620700836}, {\"match_weight\": 7.56, \"match_probability\": 0.99474, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.018506068736314774}, {\"match_weight\": 7.57, \"match_probability\": 0.99475, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.018545612692832947}, {\"match_weight\": 7.58, \"match_probability\": 0.9948, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.018565384671092033}, {\"match_weight\": 7.59, \"match_probability\": 0.99482, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01858515478670597}, {\"match_weight\": 7.6, \"match_probability\": 0.99487, \"prop\": 5.931432679062709e-05, \"cum_prop\": 
0.01864446885883808}, {\"match_weight\": 7.6, \"match_probability\": 0.99488, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.018743326887488365}, {\"match_weight\": 7.6, \"match_probability\": 0.99489, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.018802640959620476}, {\"match_weight\": 7.61, \"match_probability\": 0.9949, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.018822412937879562}, {\"match_weight\": 7.62, \"match_probability\": 0.99492, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.018881727010011673}, {\"match_weight\": 7.62, \"match_probability\": 0.99495, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01890149898827076}, {\"match_weight\": 7.63, \"match_probability\": 0.99498, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.018921269103884697}, {\"match_weight\": 7.63, \"match_probability\": 0.99499, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.018941041082143784}, {\"match_weight\": 7.65, \"match_probability\": 0.99503, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.018980585038661957}, {\"match_weight\": 7.65, \"match_probability\": 0.99505, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.019039899110794067}, {\"match_weight\": 7.67, \"match_probability\": 0.99512, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.019118985161185265}, {\"match_weight\": 7.68, \"match_probability\": 0.99514, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.0192178413271904}, {\"match_weight\": 7.7, \"match_probability\": 0.9952, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.019237613305449486}, {\"match_weight\": 7.7, \"match_probability\": 0.99522, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.019257385283708572}, {\"match_weight\": 7.71, \"match_probability\": 0.99526, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01927715539932251}, {\"match_weight\": 7.73, \"match_probability\": 0.99531, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.019296927377581596}, {\"match_weight\": 7.73, \"match_probability\": 0.99532, \"prop\": 
1.9771441657212563e-05, \"cum_prop\": 0.019316699355840683}, {\"match_weight\": 7.74, \"match_probability\": 0.99535, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01933646947145462}, {\"match_weight\": 7.74, \"match_probability\": 0.99536, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.019376013427972794}, {\"match_weight\": 7.75, \"match_probability\": 0.99539, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.019415555521845818}, {\"match_weight\": 7.76, \"match_probability\": 0.9954, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.019534185528755188}, {\"match_weight\": 7.76, \"match_probability\": 0.99542, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.019553955644369125}, {\"match_weight\": 7.77, \"match_probability\": 0.99545, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.019573727622628212}, {\"match_weight\": 7.78, \"match_probability\": 0.99547, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0195934996008873}, {\"match_weight\": 7.79, \"match_probability\": 0.99549, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.019613269716501236}, {\"match_weight\": 7.79, \"match_probability\": 0.99551, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.019633041694760323}, {\"match_weight\": 7.8, \"match_probability\": 0.99552, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.019672585651278496}, {\"match_weight\": 7.8, \"match_probability\": 0.99553, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.019692355766892433}, {\"match_weight\": 7.81, \"match_probability\": 0.99555, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01971212774515152}, {\"match_weight\": 7.81, \"match_probability\": 0.99557, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.019731899723410606}, {\"match_weight\": 7.81, \"match_probability\": 0.99558, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.01977144181728363}, {\"match_weight\": 7.82, \"match_probability\": 0.99559, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.019810983911156654}, {\"match_weight\": 7.82, 
\"match_probability\": 0.99561, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01983075588941574}, {\"match_weight\": 7.85, \"match_probability\": 0.99569, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.019850527867674828}, {\"match_weight\": 7.86, \"match_probability\": 0.99572, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.019870299845933914}, {\"match_weight\": 7.86, \"match_probability\": 0.99573, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.01989006996154785}, {\"match_weight\": 7.88, \"match_probability\": 0.99578, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.019909841939806938}, {\"match_weight\": 7.89, \"match_probability\": 0.9958, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.02002847008407116}, {\"match_weight\": 7.9, \"match_probability\": 0.99584, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.020048242062330246}, {\"match_weight\": 7.91, \"match_probability\": 0.99587, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.020068014040589333}, {\"match_weight\": 7.93, \"match_probability\": 0.9959, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02008778415620327}, {\"match_weight\": 7.93, \"match_probability\": 0.99591, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.02014710009098053}, {\"match_weight\": 7.94, \"match_probability\": 0.99595, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.020166870206594467}, {\"match_weight\": 7.95, \"match_probability\": 0.99597, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.020186642184853554}, {\"match_weight\": 7.96, \"match_probability\": 0.99601, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02020641416311264}, {\"match_weight\": 7.98, \"match_probability\": 0.99604, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.020226184278726578}, {\"match_weight\": 7.99, \"match_probability\": 0.99607, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02026572823524475}, {\"match_weight\": 8.01, \"match_probability\": 0.99613, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 
0.020285500213503838}, {\"match_weight\": 8.02, \"match_probability\": 0.99616, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.020384356379508972}, {\"match_weight\": 8.03, \"match_probability\": 0.99619, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02040412835776806}, {\"match_weight\": 8.03, \"match_probability\": 0.9962, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.020423900336027145}, {\"match_weight\": 8.04, \"match_probability\": 0.99621, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02046344242990017}, {\"match_weight\": 8.05, \"match_probability\": 0.99624, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.020483214408159256}, {\"match_weight\": 8.06, \"match_probability\": 0.99626, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.020502984523773193}, {\"match_weight\": 8.09, \"match_probability\": 0.99635, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02052275650203228}, {\"match_weight\": 8.1, \"match_probability\": 0.99638, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.02058207057416439}, {\"match_weight\": 8.11, \"match_probability\": 0.99639, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.020621614530682564}, {\"match_weight\": 8.12, \"match_probability\": 0.99641, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0206413846462965}, {\"match_weight\": 8.12, \"match_probability\": 0.99642, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.02076001465320587}, {\"match_weight\": 8.14, \"match_probability\": 0.99646, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02077978476881981}, {\"match_weight\": 8.14, \"match_probability\": 0.99647, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.020799556747078896}, {\"match_weight\": 8.15, \"match_probability\": 0.9965, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.020819328725337982}, {\"match_weight\": 8.17, \"match_probability\": 0.99653, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02083910070359707}, {\"match_weight\": 8.17, \"match_probability\": 0.99655, \"prop\": 
1.9771441657212563e-05, \"cum_prop\": 0.020858870819211006}, {\"match_weight\": 8.18, \"match_probability\": 0.99656, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.020937956869602203}, {\"match_weight\": 8.19, \"match_probability\": 0.99658, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02095772884786129}, {\"match_weight\": 8.21, \"match_probability\": 0.99664, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.021036814898252487}, {\"match_weight\": 8.22, \"match_probability\": 0.99665, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.021056585013866425}, {\"match_weight\": 8.22, \"match_probability\": 0.99666, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.021096128970384598}, {\"match_weight\": 8.23, \"match_probability\": 0.99667, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.02121475711464882}, {\"match_weight\": 8.24, \"match_probability\": 0.99669, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.021234529092907906}, {\"match_weight\": 8.25, \"match_probability\": 0.99672, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.021254299208521843}, {\"match_weight\": 8.26, \"match_probability\": 0.99674, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02127407118678093}, {\"match_weight\": 8.27, \"match_probability\": 0.99676, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.021293843165040016}, {\"match_weight\": 8.28, \"match_probability\": 0.99679, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.021313615143299103}, {\"match_weight\": 8.28, \"match_probability\": 0.9968, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02133338525891304}, {\"match_weight\": 8.3, \"match_probability\": 0.99683, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.021353157237172127}, {\"match_weight\": 8.31, \"match_probability\": 0.99686, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02139269933104515}, {\"match_weight\": 8.33, \"match_probability\": 0.99691, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.021432243287563324}, {\"match_weight\": 8.34, 
\"match_probability\": 0.99693, \"prop\": 0.00013840009341947734, \"cum_prop\": 0.021570643410086632}, {\"match_weight\": 8.35, \"match_probability\": 0.99694, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.021610185503959656}, {\"match_weight\": 8.37, \"match_probability\": 0.99698, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.021669499576091766}, {\"match_weight\": 8.38, \"match_probability\": 0.997, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02170904353260994}, {\"match_weight\": 8.38, \"match_probability\": 0.99702, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.021728815510869026}, {\"match_weight\": 8.4, \"match_probability\": 0.99706, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.021748585626482964}, {\"match_weight\": 8.41, \"match_probability\": 0.99707, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02176835760474205}, {\"match_weight\": 8.42, \"match_probability\": 0.99708, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.021788129583001137}, {\"match_weight\": 8.42, \"match_probability\": 0.99709, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.021807899698615074}, {\"match_weight\": 8.43, \"match_probability\": 0.99711, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02182767167687416}, {\"match_weight\": 8.46, \"match_probability\": 0.99717, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.021867215633392334}, {\"match_weight\": 8.47, \"match_probability\": 0.99718, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.021985843777656555}, {\"match_weight\": 8.48, \"match_probability\": 0.9972, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02202538587152958}, {\"match_weight\": 8.48, \"match_probability\": 0.99721, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.02208469994366169}, {\"match_weight\": 8.5, \"match_probability\": 0.99724, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.022104471921920776}, {\"match_weight\": 8.52, \"match_probability\": 0.99728, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 
0.02214401587843895}, {\"match_weight\": 8.53, \"match_probability\": 0.99729, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.022183557972311974}, {\"match_weight\": 8.53, \"match_probability\": 0.99731, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.022223100066184998}, {\"match_weight\": 8.57, \"match_probability\": 0.99737, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.022242872044444084}, {\"match_weight\": 8.57, \"match_probability\": 0.99738, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.022302186116576195}, {\"match_weight\": 8.59, \"match_probability\": 0.99741, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.022341730073094368}, {\"match_weight\": 8.6, \"match_probability\": 0.99743, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.022361500188708305}, {\"match_weight\": 8.61, \"match_probability\": 0.99744, \"prop\": 0.0001581715332577005, \"cum_prop\": 0.0225196722894907}, {\"match_weight\": 8.61, \"match_probability\": 0.99745, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.022539444267749786}, {\"match_weight\": 8.62, \"match_probability\": 0.99746, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02257898636162281}, {\"match_weight\": 8.63, \"match_probability\": 0.99747, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.022618530318140984}, {\"match_weight\": 8.63, \"match_probability\": 0.99748, \"prop\": 0.00013840009341947734, \"cum_prop\": 0.02275693044066429}, {\"match_weight\": 8.64, \"match_probability\": 0.99749, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02277670055627823}, {\"match_weight\": 8.64, \"match_probability\": 0.9975, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.02283601462841034}, {\"match_weight\": 8.65, \"match_probability\": 0.99751, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.022855786606669426}, {\"match_weight\": 8.65, \"match_probability\": 0.99752, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.022915100678801537}, {\"match_weight\": 8.67, \"match_probability\": 0.99756, \"prop\": 
1.9771441657212563e-05, \"cum_prop\": 0.022934872657060623}, {\"match_weight\": 8.68, \"match_probability\": 0.99757, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.022974414750933647}, {\"match_weight\": 8.69, \"match_probability\": 0.99759, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.023093044757843018}, {\"match_weight\": 8.7, \"match_probability\": 0.9976, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02313258685171604}, {\"match_weight\": 8.72, \"match_probability\": 0.99763, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.023152358829975128}, {\"match_weight\": 8.73, \"match_probability\": 0.99765, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.023172130808234215}, {\"match_weight\": 8.74, \"match_probability\": 0.99766, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.023231444880366325}, {\"match_weight\": 8.75, \"match_probability\": 0.99768, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.023251214995980263}, {\"match_weight\": 8.76, \"match_probability\": 0.99769, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.023290758952498436}, {\"match_weight\": 8.76, \"match_probability\": 0.99771, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02333030104637146}, {\"match_weight\": 8.78, \"match_probability\": 0.99772, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.023409387096762657}, {\"match_weight\": 8.78, \"match_probability\": 0.99773, \"prop\": 0.0007513148011639714, \"cum_prop\": 0.024160701781511307}, {\"match_weight\": 8.79, \"match_probability\": 0.99774, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02420024573802948}, {\"match_weight\": 8.79, \"match_probability\": 0.99775, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.024239787831902504}, {\"match_weight\": 8.8, \"match_probability\": 0.99776, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.024279329925775528}, {\"match_weight\": 8.8, \"match_probability\": 0.99777, \"prop\": 0.0005931432824581861, \"cum_prop\": 0.024872474372386932}, {\"match_weight\": 8.82, 
\"match_probability\": 0.99779, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02489224635064602}, {\"match_weight\": 8.83, \"match_probability\": 0.9978, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.024912016466259956}, {\"match_weight\": 8.84, \"match_probability\": 0.99783, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.024971330538392067}, {\"match_weight\": 8.86, \"match_probability\": 0.99785, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.024991102516651154}, {\"match_weight\": 8.87, \"match_probability\": 0.99787, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02501087449491024}, {\"match_weight\": 8.89, \"match_probability\": 0.99789, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.025030644610524178}, {\"match_weight\": 8.89, \"match_probability\": 0.9979, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02507018856704235}, {\"match_weight\": 8.9, \"match_probability\": 0.99791, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.025089960545301437}, {\"match_weight\": 8.91, \"match_probability\": 0.99792, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02512950263917446}, {\"match_weight\": 8.92, \"match_probability\": 0.99794, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.025169044733047485}, {\"match_weight\": 8.93, \"match_probability\": 0.99795, \"prop\": 0.0001581715332577005, \"cum_prop\": 0.02532721683382988}, {\"match_weight\": 8.94, \"match_probability\": 0.99796, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.025346988812088966}, {\"match_weight\": 8.94, \"match_probability\": 0.99797, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02538653090596199}, {\"match_weight\": 8.97, \"match_probability\": 0.998, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.025426074862480164}, {\"match_weight\": 8.97, \"match_probability\": 0.99801, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.0254458449780941}, {\"match_weight\": 8.99, \"match_probability\": 0.99803, \"prop\": 7.908576662885025e-05, \"cum_prop\": 
0.025524931028485298}, {\"match_weight\": 9.0, \"match_probability\": 0.99805, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.025544703006744385}, {\"match_weight\": 9.01, \"match_probability\": 0.99806, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02558424510061741}, {\"match_weight\": 9.02, \"match_probability\": 0.99807, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.025604017078876495}, {\"match_weight\": 9.03, \"match_probability\": 0.99809, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02564356103539467}, {\"match_weight\": 9.04, \"match_probability\": 0.9981, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.025663331151008606}, {\"match_weight\": 9.05, \"match_probability\": 0.99811, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.025683103129267693}, {\"match_weight\": 9.05, \"match_probability\": 0.99812, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02570287510752678}, {\"match_weight\": 9.07, \"match_probability\": 0.99814, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.025722645223140717}, {\"match_weight\": 9.07, \"match_probability\": 0.99815, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.025742417201399803}, {\"match_weight\": 9.09, \"match_probability\": 0.99816, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02576218917965889}, {\"match_weight\": 9.09, \"match_probability\": 0.99817, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.025841275230050087}, {\"match_weight\": 9.1, \"match_probability\": 0.99818, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02588081732392311}, {\"match_weight\": 9.11, \"match_probability\": 0.99819, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.02594013139605522}, {\"match_weight\": 9.12, \"match_probability\": 0.9982, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.025979675352573395}, {\"match_weight\": 9.13, \"match_probability\": 0.99821, \"prop\": 0.00025702876155264676, \"cum_prop\": 0.026236703619360924}, {\"match_weight\": 9.14, \"match_probability\": 0.99823, \"prop\": 
3.9542883314425126e-05, \"cum_prop\": 0.026276245713233948}, {\"match_weight\": 9.15, \"match_probability\": 0.99824, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02631578966975212}, {\"match_weight\": 9.16, \"match_probability\": 0.99825, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02633555978536606}, {\"match_weight\": 9.17, \"match_probability\": 0.99826, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.026414645835757256}, {\"match_weight\": 9.18, \"match_probability\": 0.99828, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02645418979227543}, {\"match_weight\": 9.19, \"match_probability\": 0.99829, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.026473959907889366}, {\"match_weight\": 9.2, \"match_probability\": 0.9983, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02651350386440754}, {\"match_weight\": 9.21, \"match_probability\": 0.99831, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.026533275842666626}, {\"match_weight\": 9.22, \"match_probability\": 0.99832, \"prop\": 0.000316343066515401, \"cum_prop\": 0.026849618181586266}, {\"match_weight\": 9.23, \"match_probability\": 0.99833, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.026908932253718376}, {\"match_weight\": 9.23, \"match_probability\": 0.99834, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02694847621023655}, {\"match_weight\": 9.24, \"match_probability\": 0.99835, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.026968246325850487}, {\"match_weight\": 9.26, \"match_probability\": 0.99837, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.027027560397982597}, {\"match_weight\": 9.27, \"match_probability\": 0.99838, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02706710435450077}, {\"match_weight\": 9.27, \"match_probability\": 0.99839, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.027146190404891968}, {\"match_weight\": 9.29, \"match_probability\": 0.9984, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.027185732498764992}, {\"match_weight\": 9.3, 
\"match_probability\": 0.99841, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.027225276455283165}, {\"match_weight\": 9.3, \"match_probability\": 0.99842, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.027245046570897102}, {\"match_weight\": 9.32, \"match_probability\": 0.99844, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02726481854915619}, {\"match_weight\": 9.33, \"match_probability\": 0.99845, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.0273241326212883}, {\"match_weight\": 9.34, \"match_probability\": 0.99846, \"prop\": 0.00013840009341947734, \"cum_prop\": 0.027462532743811607}, {\"match_weight\": 9.37, \"match_probability\": 0.99849, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.027521846815943718}, {\"match_weight\": 9.38, \"match_probability\": 0.9985, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.027541618794202805}, {\"match_weight\": 9.39, \"match_probability\": 0.99851, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02758116088807583}, {\"match_weight\": 9.4, \"match_probability\": 0.99852, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.027660246938467026}, {\"match_weight\": 9.41, \"match_probability\": 0.99853, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.027719561010599136}, {\"match_weight\": 9.41, \"match_probability\": 0.99854, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.027778875082731247}, {\"match_weight\": 9.43, \"match_probability\": 0.99855, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.027838191017508507}, {\"match_weight\": 9.44, \"match_probability\": 0.99856, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.027857961133122444}, {\"match_weight\": 9.45, \"match_probability\": 0.99857, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.027956819161772728}, {\"match_weight\": 9.46, \"match_probability\": 0.99858, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.02801613323390484}, {\"match_weight\": 9.47, \"match_probability\": 0.99859, \"prop\": 7.908576662885025e-05, \"cum_prop\": 
0.028095219284296036}, {\"match_weight\": 9.48, \"match_probability\": 0.9986, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02813476137816906}, {\"match_weight\": 9.49, \"match_probability\": 0.99861, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.02819407545030117}, {\"match_weight\": 9.5, \"match_probability\": 0.99862, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.028213847428560257}, {\"match_weight\": 9.51, \"match_probability\": 0.99863, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02825339138507843}, {\"match_weight\": 9.52, \"match_probability\": 0.99864, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.028273161500692368}, {\"match_weight\": 9.53, \"match_probability\": 0.99865, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.028352247551083565}, {\"match_weight\": 9.55, \"match_probability\": 0.99866, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.028391791507601738}, {\"match_weight\": 9.56, \"match_probability\": 0.99867, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.028490647673606873}, {\"match_weight\": 9.57, \"match_probability\": 0.99868, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.02856973372399807}, {\"match_weight\": 9.58, \"match_probability\": 0.99869, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.028609275817871094}, {\"match_weight\": 9.6, \"match_probability\": 0.99871, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.028708133846521378}, {\"match_weight\": 9.6, \"match_probability\": 0.99872, \"prop\": 0.0001581715332577005, \"cum_prop\": 0.028866305947303772}, {\"match_weight\": 9.62, \"match_probability\": 0.99873, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.028905848041176796}, {\"match_weight\": 9.64, \"match_probability\": 0.99874, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.028965162113308907}, {\"match_weight\": 9.64, \"match_probability\": 0.99875, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.02900470606982708}, {\"match_weight\": 9.65, \"match_probability\": 0.99876, \"prop\": 
3.9542883314425126e-05, \"cum_prop\": 0.029044248163700104}, {\"match_weight\": 9.67, \"match_probability\": 0.99877, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.029103562235832214}, {\"match_weight\": 9.68, \"match_probability\": 0.99878, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.029143106192350388}, {\"match_weight\": 9.69, \"match_probability\": 0.99879, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.029162876307964325}, {\"match_weight\": 9.7, \"match_probability\": 0.9988, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.02918264828622341}, {\"match_weight\": 9.72, \"match_probability\": 0.99881, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.029202420264482498}, {\"match_weight\": 9.73, \"match_probability\": 0.99882, \"prop\": 0.0004349717346485704, \"cum_prop\": 0.02963739074766636}, {\"match_weight\": 9.75, \"match_probability\": 0.99884, \"prop\": 0.0003558859461918473, \"cum_prop\": 0.029993277043104172}, {\"match_weight\": 9.76, \"match_probability\": 0.99885, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.030032820999622345}, {\"match_weight\": 9.77, \"match_probability\": 0.99886, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.030052591115236282}, {\"match_weight\": 9.79, \"match_probability\": 0.99887, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.03013167716562748}, {\"match_weight\": 9.81, \"match_probability\": 0.99888, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.030230535194277763}, {\"match_weight\": 9.81, \"match_probability\": 0.99889, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.030289849266409874}, {\"match_weight\": 9.83, \"match_probability\": 0.9989, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.03036893531680107}, {\"match_weight\": 9.85, \"match_probability\": 0.99891, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.030428249388933182}, {\"match_weight\": 9.85, \"match_probability\": 0.99892, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.03044802136719227}, {\"match_weight\": 9.87, 
\"match_probability\": 0.99893, \"prop\": 0.0003361145209055394, \"cum_prop\": 0.030784135684370995}, {\"match_weight\": 9.88, \"match_probability\": 0.99894, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.030843449756503105}, {\"match_weight\": 9.9, \"match_probability\": 0.99895, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.030922535806894302}, {\"match_weight\": 9.9, \"match_probability\": 0.99896, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.03094230592250824}, {\"match_weight\": 9.93, \"match_probability\": 0.99897, \"prop\": 0.001028115046210587, \"cum_prop\": 0.031970422714948654}, {\"match_weight\": 9.94, \"match_probability\": 0.99898, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.0320495069026947}, {\"match_weight\": 9.95, \"match_probability\": 0.99899, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.03210882097482681}, {\"match_weight\": 9.99, \"match_probability\": 0.99901, \"prop\": 0.0002965716412290931, \"cum_prop\": 0.032405395060777664}, {\"match_weight\": 9.99, \"match_probability\": 0.99902, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.03244493529200554}, {\"match_weight\": 10.02, \"match_probability\": 0.99903, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.03248447924852371}, {\"match_weight\": 10.03, \"match_probability\": 0.99904, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.032524023205041885}, {\"match_weight\": 10.04, \"match_probability\": 0.99905, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.032583337277173996}, {\"match_weight\": 10.06, \"match_probability\": 0.99906, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.03264265134930611}, {\"match_weight\": 10.07, \"match_probability\": 0.99907, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.03270196542143822}, {\"match_weight\": 10.09, \"match_probability\": 0.99908, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.03276127949357033}, {\"match_weight\": 10.11, \"match_probability\": 0.99909, \"prop\": 0.00013840009341947734, \"cum_prop\": 
0.032899677753448486}, {\"match_weight\": 10.11, \"match_probability\": 0.9991, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.032958995550870895}, {\"match_weight\": 10.14, \"match_probability\": 0.99911, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.033018309623003006}, {\"match_weight\": 10.14, \"match_probability\": 0.99912, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.03303807973861694}, {\"match_weight\": 10.17, \"match_probability\": 0.99913, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.03313693776726723}, {\"match_weight\": 10.18, \"match_probability\": 0.99914, \"prop\": 1.9771441657212563e-05, \"cum_prop\": 0.033156707882881165}, {\"match_weight\": 10.2, \"match_probability\": 0.99915, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.03325556591153145}, {\"match_weight\": 10.22, \"match_probability\": 0.99916, \"prop\": 0.0009094863198697567, \"cum_prop\": 0.03416505083441734}, {\"match_weight\": 10.24, \"match_probability\": 0.99917, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.034224364906549454}, {\"match_weight\": 10.26, \"match_probability\": 0.99918, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.03432322293519974}, {\"match_weight\": 10.28, \"match_probability\": 0.99919, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.034402310848236084}, {\"match_weight\": 10.28, \"match_probability\": 0.9992, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.034520938992500305}, {\"match_weight\": 10.3, \"match_probability\": 0.99921, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.034639567136764526}, {\"match_weight\": 10.33, \"match_probability\": 0.99922, \"prop\": 3.9542883314425126e-05, \"cum_prop\": 0.0346791110932827}, {\"match_weight\": 10.35, \"match_probability\": 0.99923, \"prop\": 0.0003954288549721241, \"cum_prop\": 0.035074539482593536}, {\"match_weight\": 10.36, \"match_probability\": 0.99924, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.03513385355472565}, {\"match_weight\": 10.39, \"match_probability\": 0.99925, 
\"prop\": 0.0004349717346485704, \"cum_prop\": 0.03556882590055466}, {\"match_weight\": 10.4, \"match_probability\": 0.99926, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.03566768020391464}, {\"match_weight\": 10.42, \"match_probability\": 0.99927, \"prop\": 0.0001581715332577005, \"cum_prop\": 0.03582585230469704}, {\"match_weight\": 10.45, \"match_probability\": 0.99928, \"prop\": 0.000316343066515401, \"cum_prop\": 0.036142196506261826}, {\"match_weight\": 10.46, \"match_probability\": 0.99929, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.036201510578393936}, {\"match_weight\": 10.49, \"match_probability\": 0.9993, \"prop\": 0.0001581715332577005, \"cum_prop\": 0.03635968267917633}, {\"match_weight\": 10.51, \"match_probability\": 0.99931, \"prop\": 0.00017794297309592366, \"cum_prop\": 0.03653762489557266}, {\"match_weight\": 10.53, \"match_probability\": 0.99932, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.036656253039836884}, {\"match_weight\": 10.55, \"match_probability\": 0.99933, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.036774881184101105}, {\"match_weight\": 10.57, \"match_probability\": 0.99934, \"prop\": 0.0001581715332577005, \"cum_prop\": 0.0369330532848835}, {\"match_weight\": 10.6, \"match_probability\": 0.99935, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.03705168142914772}, {\"match_weight\": 10.62, \"match_probability\": 0.99936, \"prop\": 0.00019771442748606205, \"cum_prop\": 0.03724939748644829}, {\"match_weight\": 10.63, \"match_probability\": 0.99937, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.03736802563071251}, {\"match_weight\": 10.66, \"match_probability\": 0.99938, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.03742733970284462}, {\"match_weight\": 10.69, \"match_probability\": 0.99939, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.03754596784710884}, {\"match_weight\": 10.71, \"match_probability\": 0.9994, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.03762505576014519}, {\"match_weight\": 
10.73, \"match_probability\": 0.99941, \"prop\": 0.0001581715332577005, \"cum_prop\": 0.03778322413563728}, {\"match_weight\": 10.76, \"match_probability\": 0.99942, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.0379018560051918}, {\"match_weight\": 10.79, \"match_probability\": 0.99943, \"prop\": 0.0003361145209055394, \"cum_prop\": 0.03823797032237053}, {\"match_weight\": 10.81, \"match_probability\": 0.99944, \"prop\": 7.908576662885025e-05, \"cum_prop\": 0.03831705451011658}, {\"match_weight\": 10.83, \"match_probability\": 0.99945, \"prop\": 0.00019771442748606205, \"cum_prop\": 0.038514770567417145}, {\"match_weight\": 10.87, \"match_probability\": 0.99946, \"prop\": 0.0004349717346485704, \"cum_prop\": 0.038949739187955856}, {\"match_weight\": 10.88, \"match_probability\": 0.99947, \"prop\": 5.931432679062709e-05, \"cum_prop\": 0.039009056985378265}, {\"match_weight\": 10.92, \"match_probability\": 0.99948, \"prop\": 0.0002965716412290931, \"cum_prop\": 0.03930562734603882}, {\"match_weight\": 10.95, \"match_probability\": 0.99949, \"prop\": 0.0005931432824581861, \"cum_prop\": 0.03989877179265022}, {\"match_weight\": 10.97, \"match_probability\": 0.9995, \"prop\": 0.00019771442748606205, \"cum_prop\": 0.04009648412466049}, {\"match_weight\": 11.01, \"match_probability\": 0.99951, \"prop\": 0.0002174858673242852, \"cum_prop\": 0.040313970297575}, {\"match_weight\": 11.03, \"match_probability\": 0.99952, \"prop\": 9.885721374303102e-05, \"cum_prop\": 0.04041282832622528}, {\"match_weight\": 11.07, \"match_probability\": 0.99953, \"prop\": 0.00017794297309592366, \"cum_prop\": 0.04059077054262161}, {\"match_weight\": 11.1, \"match_probability\": 0.99954, \"prop\": 0.00017794297309592366, \"cum_prop\": 0.040768712759017944}, {\"match_weight\": 11.13, \"match_probability\": 0.99955, \"prop\": 0.00023725730716250837, \"cum_prop\": 0.04100596904754639}, {\"match_weight\": 11.16, \"match_probability\": 0.99956, \"prop\": 0.00017794297309592366, \"cum_prop\": 
0.04118391498923302}, {\"match_weight\": 11.2, \"match_probability\": 0.99957, \"prop\": 0.00011862865358125418, \"cum_prop\": 0.04130254313349724}, {\"match_weight\": 11.23, \"match_probability\": 0.99958, \"prop\": 0.0006722290418110788, \"cum_prop\": 0.04197477176785469}, {\"match_weight\": 11.27, \"match_probability\": 0.99959, \"prop\": 0.000632686133030802, \"cum_prop\": 0.04260745644569397}, {\"match_weight\": 11.3, \"match_probability\": 0.9996, \"prop\": 0.0002768001868389547, \"cum_prop\": 0.042884256690740585}, {\"match_weight\": 11.34, \"match_probability\": 0.99961, \"prop\": 0.00017794297309592366, \"cum_prop\": 0.043062202632427216}, {\"match_weight\": 11.38, \"match_probability\": 0.99962, \"prop\": 0.0005536003736779094, \"cum_prop\": 0.04361579939723015}, {\"match_weight\": 11.42, \"match_probability\": 0.99963, \"prop\": 0.0001581715332577005, \"cum_prop\": 0.04377397149801254}, {\"match_weight\": 11.46, \"match_probability\": 0.99964, \"prop\": 0.0002174858673242852, \"cum_prop\": 0.04399145767092705}, {\"match_weight\": 11.5, \"match_probability\": 0.99965, \"prop\": 0.0002965716412290931, \"cum_prop\": 0.0442880317568779}, {\"match_weight\": 11.54, \"match_probability\": 0.99966, \"prop\": 0.000316343066515401, \"cum_prop\": 0.04460437223315239}, {\"match_weight\": 11.58, \"match_probability\": 0.99967, \"prop\": 0.00019771442748606205, \"cum_prop\": 0.04480208829045296}, {\"match_weight\": 11.63, \"match_probability\": 0.99968, \"prop\": 0.00013840009341947734, \"cum_prop\": 0.044940486550331116}, {\"match_weight\": 11.67, \"match_probability\": 0.99969, \"prop\": 0.000316343066515401, \"cum_prop\": 0.045256830751895905}, {\"match_weight\": 11.72, \"match_probability\": 0.9997, \"prop\": 0.0003756574005819857, \"cum_prop\": 0.045632489025592804}, {\"match_weight\": 11.77, \"match_probability\": 0.99971, \"prop\": 0.0002965716412290931, \"cum_prop\": 0.04592905938625336}, {\"match_weight\": 11.83, \"match_probability\": 0.99972, \"prop\": 
0.0002174858673242852, \"cum_prop\": 0.04614654555916786}, {\"match_weight\": 11.88, \"match_probability\": 0.99973, \"prop\": 0.0006920004962012172, \"cum_prop\": 0.04683854803442955}, {\"match_weight\": 11.93, \"match_probability\": 0.99974, \"prop\": 0.0011269721435382962, \"cum_prop\": 0.04796551913022995}, {\"match_weight\": 11.99, \"match_probability\": 0.99975, \"prop\": 0.0004942860687151551, \"cum_prop\": 0.04845980554819107}, {\"match_weight\": 12.05, \"match_probability\": 0.99976, \"prop\": 0.0005536003736779094, \"cum_prop\": 0.0490134060382843}, {\"match_weight\": 12.12, \"match_probability\": 0.99977, \"prop\": 0.0007908577099442482, \"cum_prop\": 0.049804262816905975}, {\"match_weight\": 12.18, \"match_probability\": 0.99978, \"prop\": 0.0004942860687151551, \"cum_prop\": 0.050298549234867096}, {\"match_weight\": 12.24, \"match_probability\": 0.99979, \"prop\": 0.00041520028025843203, \"cum_prop\": 0.05071374773979187}, {\"match_weight\": 12.32, \"match_probability\": 0.9998, \"prop\": 0.0005536003736779094, \"cum_prop\": 0.0512673482298851}, {\"match_weight\": 12.39, \"match_probability\": 0.99981, \"prop\": 0.00025702876155264676, \"cum_prop\": 0.05152437835931778}, {\"match_weight\": 12.48, \"match_probability\": 0.99982, \"prop\": 0.0005140575231052935, \"cum_prop\": 0.05203843489289284}, {\"match_weight\": 12.56, \"match_probability\": 0.99983, \"prop\": 0.0007117718923836946, \"cum_prop\": 0.05275020748376846}, {\"match_weight\": 12.65, \"match_probability\": 0.99984, \"prop\": 0.0014037723885849118, \"cum_prop\": 0.05415397882461548}, {\"match_weight\": 12.75, \"match_probability\": 0.99985, \"prop\": 0.0014433152973651886, \"cum_prop\": 0.05559729412198067}, {\"match_weight\": 12.85, \"match_probability\": 0.99986, \"prop\": 0.0015224009985104203, \"cum_prop\": 0.0571196973323822}, {\"match_weight\": 12.97, \"match_probability\": 0.99987, \"prop\": 0.0010083435336127877, \"cum_prop\": 0.05812804028391838}, {\"match_weight\": 13.08, 
\"match_probability\": 0.99988, \"prop\": 0.0012456008698791265, \"cum_prop\": 0.059373639523983}, {\"match_weight\": 13.22, \"match_probability\": 0.99989, \"prop\": 0.0016805726336315274, \"cum_prop\": 0.06105421483516693}, {\"match_weight\": 13.36, \"match_probability\": 0.9999, \"prop\": 0.0014235437847673893, \"cum_prop\": 0.062477756291627884}, {\"match_weight\": 13.52, \"match_probability\": 0.99991, \"prop\": 0.0010083435336127877, \"cum_prop\": 0.06348609924316406}, {\"match_weight\": 13.7, \"match_probability\": 0.99992, \"prop\": 0.0024121159221976995, \"cum_prop\": 0.06589821726083755}, {\"match_weight\": 13.9, \"match_probability\": 0.99993, \"prop\": 0.0033413737546652555, \"cum_prop\": 0.06923958659172058}, {\"match_weight\": 14.15, \"match_probability\": 0.99994, \"prop\": 0.0033413737546652555, \"cum_prop\": 0.0725809633731842}, {\"match_weight\": 14.44, \"match_probability\": 0.99995, \"prop\": 0.0030843450222164392, \"cum_prop\": 0.07566531002521515}, {\"match_weight\": 14.8, \"match_probability\": 0.99996, \"prop\": 0.003954288549721241, \"cum_prop\": 0.07961959391832352}, {\"match_weight\": 15.29, \"match_probability\": 0.99997, \"prop\": 0.005812804214656353, \"cum_prop\": 0.08543240278959274}, {\"match_weight\": 16.02, \"match_probability\": 0.99998, \"prop\": 0.008343548513948917, \"cum_prop\": 0.09377595037221909}, {\"match_weight\": 17.61, \"match_probability\": 0.99999, \"prop\": 0.020483214408159256, \"cum_prop\": 0.11425916105508804}]}}, {\"mode\": \"vega-lite\"});\n",
              "</script>"
            ],
            "text/plain": [
              "alt.LayerChart(...)"
            ]
          },
          "execution_count": 14,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "linker.evaluation.unlinkables_chart()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 15,
      "id": "4624c6a0-a1a8-4762-9003-b3da5aa45a77",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>match_weight</th>\n",
              "      <th>match_probability</th>\n",
              "      <th>unique_id_l</th>\n",
              "      <th>unique_id_r</th>\n",
              "      <th>first_name_l</th>\n",
              "      <th>first_name_r</th>\n",
              "      <th>gamma_first_name</th>\n",
              "      <th>tf_first_name_l</th>\n",
              "      <th>tf_first_name_r</th>\n",
              "      <th>bf_first_name</th>\n",
              "      <th>...</th>\n",
              "      <th>bf_birth_place</th>\n",
              "      <th>bf_tf_adj_birth_place</th>\n",
              "      <th>occupation_l</th>\n",
              "      <th>occupation_r</th>\n",
              "      <th>gamma_occupation</th>\n",
              "      <th>tf_occupation_l</th>\n",
              "      <th>tf_occupation_r</th>\n",
              "      <th>bf_occupation</th>\n",
              "      <th>bf_tf_adj_occupation</th>\n",
              "      <th>match_key</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>27.149493</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>Q2296770-1</td>\n",
              "      <td>Q2296770-12</td>\n",
              "      <td>thomas</td>\n",
              "      <td>rhomas</td>\n",
              "      <td>0</td>\n",
              "      <td>0.028667</td>\n",
              "      <td>0.000059</td>\n",
              "      <td>0.455194</td>\n",
              "      <td>...</td>\n",
              "      <td>160.713933</td>\n",
              "      <td>4.179108</td>\n",
              "      <td>politician</td>\n",
              "      <td>politician</td>\n",
              "      <td>1</td>\n",
              "      <td>0.088932</td>\n",
              "      <td>0.088932</td>\n",
              "      <td>22.916859</td>\n",
              "      <td>0.441273</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>1.627242</td>\n",
              "      <td>0.755454</td>\n",
              "      <td>Q2296770-1</td>\n",
              "      <td>Q2296770-15</td>\n",
              "      <td>thomas</td>\n",
              "      <td>clifford,</td>\n",
              "      <td>0</td>\n",
              "      <td>0.028667</td>\n",
              "      <td>0.000020</td>\n",
              "      <td>0.455194</td>\n",
              "      <td>...</td>\n",
              "      <td>0.154550</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>politician</td>\n",
              "      <td>&lt;NA&gt;</td>\n",
              "      <td>-1</td>\n",
              "      <td>0.088932</td>\n",
              "      <td>NaN</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>29.206505</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>Q2296770-1</td>\n",
              "      <td>Q2296770-3</td>\n",
              "      <td>thomas</td>\n",
              "      <td>tom</td>\n",
              "      <td>0</td>\n",
              "      <td>0.028667</td>\n",
              "      <td>0.012948</td>\n",
              "      <td>0.455194</td>\n",
              "      <td>...</td>\n",
              "      <td>160.713933</td>\n",
              "      <td>4.179108</td>\n",
              "      <td>politician</td>\n",
              "      <td>politician</td>\n",
              "      <td>1</td>\n",
              "      <td>0.088932</td>\n",
              "      <td>0.088932</td>\n",
              "      <td>22.916859</td>\n",
              "      <td>0.441273</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>13.783027</td>\n",
              "      <td>0.999929</td>\n",
              "      <td>Q2296770-1</td>\n",
              "      <td>Q2296770-7</td>\n",
              "      <td>thomas</td>\n",
              "      <td>tom</td>\n",
              "      <td>0</td>\n",
              "      <td>0.028667</td>\n",
              "      <td>0.012948</td>\n",
              "      <td>0.455194</td>\n",
              "      <td>...</td>\n",
              "      <td>0.154550</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>politician</td>\n",
              "      <td>&lt;NA&gt;</td>\n",
              "      <td>-1</td>\n",
              "      <td>0.088932</td>\n",
              "      <td>NaN</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>29.206505</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>Q2296770-2</td>\n",
              "      <td>Q2296770-3</td>\n",
              "      <td>thomas</td>\n",
              "      <td>tom</td>\n",
              "      <td>0</td>\n",
              "      <td>0.028667</td>\n",
              "      <td>0.012948</td>\n",
              "      <td>0.455194</td>\n",
              "      <td>...</td>\n",
              "      <td>160.713933</td>\n",
              "      <td>4.179108</td>\n",
              "      <td>politician</td>\n",
              "      <td>politician</td>\n",
              "      <td>1</td>\n",
              "      <td>0.088932</td>\n",
              "      <td>0.088932</td>\n",
              "      <td>22.916859</td>\n",
              "      <td>0.441273</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>5 rows × 38 columns</p>\n",
              "</div>"
            ],
            "text/plain": [
              "   match_weight  match_probability unique_id_l  unique_id_r first_name_l  \\\n",
              "0     27.149493           1.000000  Q2296770-1  Q2296770-12       thomas   \n",
              "1      1.627242           0.755454  Q2296770-1  Q2296770-15       thomas   \n",
              "2     29.206505           1.000000  Q2296770-1   Q2296770-3       thomas   \n",
              "3     13.783027           0.999929  Q2296770-1   Q2296770-7       thomas   \n",
              "4     29.206505           1.000000  Q2296770-2   Q2296770-3       thomas   \n",
              "\n",
              "  first_name_r  gamma_first_name  tf_first_name_l  tf_first_name_r  \\\n",
              "0       rhomas                 0         0.028667         0.000059   \n",
              "1    clifford,                 0         0.028667         0.000020   \n",
              "2          tom                 0         0.028667         0.012948   \n",
              "3          tom                 0         0.028667         0.012948   \n",
              "4          tom                 0         0.028667         0.012948   \n",
              "\n",
              "   bf_first_name  ...  bf_birth_place bf_tf_adj_birth_place occupation_l  \\\n",
              "0       0.455194  ...      160.713933              4.179108   politician   \n",
              "1       0.455194  ...        0.154550              1.000000   politician   \n",
              "2       0.455194  ...      160.713933              4.179108   politician   \n",
              "3       0.455194  ...        0.154550              1.000000   politician   \n",
              "4       0.455194  ...      160.713933              4.179108   politician   \n",
              "\n",
              "   occupation_r  gamma_occupation tf_occupation_l tf_occupation_r  \\\n",
              "0    politician                 1        0.088932        0.088932   \n",
              "1          <NA>                -1        0.088932             NaN   \n",
              "2    politician                 1        0.088932        0.088932   \n",
              "3          <NA>                -1        0.088932             NaN   \n",
              "4    politician                 1        0.088932        0.088932   \n",
              "\n",
              "   bf_occupation  bf_tf_adj_occupation match_key  \n",
              "0      22.916859              0.441273         1  \n",
              "1       1.000000              1.000000         1  \n",
              "2      22.916859              0.441273         1  \n",
              "3       1.000000              1.000000         1  \n",
              "4      22.916859              0.441273         1  \n",
              "\n",
              "[5 rows x 38 columns]"
            ]
          },
          "execution_count": 15,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "df_predict = linker.inference.predict()\n",
        "df_e = df_predict.as_pandas_dataframe(limit=5)\n",
        "df_e"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "id": "cf6b3c45-1031-4ab0-9398-94d731117e2c",
      "metadata": {},
      "source": [
        "You can also view rows in this dataset as a waterfall chart as follows:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 17,
      "id": "c2f47ebb-3181-4db6-89ba-1ef60df3bba7",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "\n",
              "<style>\n",
              "  #altair-viz-8399dda501cd48c795032b55d92e89ab.vega-embed {\n",
              "    width: 100%;\n",
              "    display: flex;\n",
              "  }\n",
              "\n",
              "  #altair-viz-8399dda501cd48c795032b55d92e89ab.vega-embed details,\n",
              "  #altair-viz-8399dda501cd48c795032b55d92e89ab.vega-embed details summary {\n",
              "    position: relative;\n",
              "  }\n",
              "</style>\n",
              "<div id=\"altair-viz-8399dda501cd48c795032b55d92e89ab\"></div>\n",
              "<script type=\"text/javascript\">\n",
              "  var VEGA_DEBUG = (typeof VEGA_DEBUG == \"undefined\") ? {} : VEGA_DEBUG;\n",
              "  (function(spec, embedOpt){\n",
              "    let outputDiv = document.currentScript.previousElementSibling;\n",
              "    if (outputDiv.id !== \"altair-viz-8399dda501cd48c795032b55d92e89ab\") {\n",
              "      outputDiv = document.getElementById(\"altair-viz-8399dda501cd48c795032b55d92e89ab\");\n",
              "    }\n",
              "    const paths = {\n",
              "      \"vega\": \"https://cdn.jsdelivr.net/npm/vega@5?noext\",\n",
              "      \"vega-lib\": \"https://cdn.jsdelivr.net/npm/vega-lib?noext\",\n",
              "      \"vega-lite\": \"https://cdn.jsdelivr.net/npm/vega-lite@5.17.0?noext\",\n",
              "      \"vega-embed\": \"https://cdn.jsdelivr.net/npm/vega-embed@6?noext\",\n",
              "    };\n",
              "\n",
              "    function maybeLoadScript(lib, version) {\n",
              "      var key = `${lib.replace(\"-\", \"\")}_version`;\n",
              "      return (VEGA_DEBUG[key] == version) ?\n",
              "        Promise.resolve(paths[lib]) :\n",
              "        new Promise(function(resolve, reject) {\n",
              "          var s = document.createElement('script');\n",
              "          document.getElementsByTagName(\"head\")[0].appendChild(s);\n",
              "          s.async = true;\n",
              "          s.onload = () => {\n",
              "            VEGA_DEBUG[key] = version;\n",
              "            return resolve(paths[lib]);\n",
              "          };\n",
              "          s.onerror = () => reject(`Error loading script: ${paths[lib]}`);\n",
              "          s.src = paths[lib];\n",
              "        });\n",
              "    }\n",
              "\n",
              "    function showError(err) {\n",
              "      outputDiv.innerHTML = `<div class=\"error\" style=\"color:red;\">${err}</div>`;\n",
              "      throw err;\n",
              "    }\n",
              "\n",
              "    function displayChart(vegaEmbed) {\n",
              "      vegaEmbed(outputDiv, spec, embedOpt)\n",
              "        .catch(err => showError(`Javascript Error: ${err.message}<br>This usually means there's a typo in your chart specification. See the javascript console for the full traceback.`));\n",
              "    }\n",
              "\n",
              "    if(typeof define === \"function\" && define.amd) {\n",
              "      requirejs.config({paths});\n",
              "      require([\"vega-embed\"], displayChart, err => showError(`Error loading script: ${err.message}`));\n",
              "    } else {\n",
              "      maybeLoadScript(\"vega\", \"5\")\n",
              "        .then(() => maybeLoadScript(\"vega-lite\", \"5.17.0\"))\n",
              "        .then(() => maybeLoadScript(\"vega-embed\", \"6\"))\n",
              "        .catch(showError)\n",
              "        .then(() => displayChart(vegaEmbed));\n",
              "    }\n",
              "  })({\"config\": {\"view\": {\"continuousWidth\": 400, \"continuousHeight\": 300}}, \"layer\": [{\"layer\": [{\"mark\": \"rule\", \"encoding\": {\"color\": {\"value\": \"black\"}, \"size\": {\"value\": 0.5}, \"y\": {\"field\": \"zero\", \"type\": \"quantitative\"}}}, {\"mark\": {\"type\": \"bar\", \"width\": 60}, \"encoding\": {\"color\": {\"condition\": {\"test\": \"(datum.log2_bayes_factor < 0)\", \"value\": \"red\"}, \"value\": \"green\"}, \"opacity\": {\"condition\": {\"test\": \"datum.column_name == 'Prior match weight' || datum.column_name == 'Final score'\", \"value\": 1}, \"value\": 0.5}, \"tooltip\": [{\"field\": \"column_name\", \"title\": \"Comparison column\", \"type\": \"nominal\"}, {\"field\": \"value_l\", \"title\": \"Value (L)\", \"type\": \"nominal\"}, {\"field\": \"value_r\", \"title\": \"Value (R)\", \"type\": \"nominal\"}, {\"field\": \"label_for_charts\", \"title\": \"Label\", \"type\": \"ordinal\"}, {\"field\": \"sql_condition\", \"title\": \"SQL condition\", \"type\": \"nominal\"}, {\"field\": \"comparison_vector_value\", \"title\": \"Comparison vector value\", \"type\": \"nominal\"}, {\"field\": \"bayes_factor\", \"format\": \",.4f\", \"title\": \"Bayes factor = m/u\", \"type\": \"quantitative\"}, {\"field\": \"log2_bayes_factor\", \"format\": \",.4f\", \"title\": \"Match weight = log2(m/u)\", \"type\": \"quantitative\"}, {\"field\": \"prob\", \"format\": \".4f\", \"title\": \"Cumulative match probability\", \"type\": \"quantitative\"}, {\"field\": \"bayes_factor_description\", \"title\": \"Match weight description\", \"type\": \"nominal\"}], \"x\": {\"axis\": {\"grid\": true, \"labelAlign\": \"center\", \"labelAngle\": -20, \"labelExpr\": \"datum.value == 'Prior' || datum.value == 'Final score' ? 
'' : datum.value\", \"labelPadding\": 10, \"tickBand\": \"extent\", \"title\": \"Column\"}, \"field\": \"column_name\", \"sort\": {\"field\": \"bar_sort_order\", \"order\": \"ascending\"}, \"type\": \"nominal\"}, \"y\": {\"axis\": {\"grid\": false, \"orient\": \"left\", \"title\": \"Match Weight\"}, \"field\": \"previous_sum\", \"type\": \"quantitative\"}, \"y2\": {\"field\": \"sum\"}}}, {\"mark\": {\"type\": \"text\", \"fontWeight\": \"bold\"}, \"encoding\": {\"color\": {\"value\": \"white\"}, \"text\": {\"condition\": {\"test\": \"abs(datum.log2_bayes_factor) > 1\", \"field\": \"log2_bayes_factor\", \"format\": \".2f\", \"type\": \"nominal\"}, \"value\": \"\"}, \"x\": {\"axis\": {\"labelAngle\": -20, \"title\": \"Column\"}, \"field\": \"column_name\", \"sort\": {\"field\": \"bar_sort_order\", \"order\": \"ascending\"}, \"type\": \"nominal\"}, \"y\": {\"axis\": {\"orient\": \"left\"}, \"field\": \"center\", \"type\": \"quantitative\"}}}, {\"mark\": {\"type\": \"text\", \"baseline\": \"bottom\", \"dy\": -25, \"fontWeight\": \"bold\"}, \"encoding\": {\"color\": {\"value\": \"black\"}, \"text\": {\"field\": \"column_name\", \"type\": \"nominal\"}, \"x\": {\"axis\": {\"labelAngle\": -20, \"title\": \"Column\"}, \"field\": \"column_name\", \"sort\": {\"field\": \"bar_sort_order\", \"order\": \"ascending\"}, \"type\": \"nominal\"}, \"y\": {\"field\": \"sum_top\", \"type\": \"quantitative\"}}}, {\"mark\": {\"type\": \"text\", \"baseline\": \"bottom\", \"dy\": -13, \"fontSize\": 8}, \"encoding\": {\"color\": {\"value\": \"grey\"}, \"text\": {\"field\": \"value_l\", \"type\": \"nominal\"}, \"x\": {\"axis\": {\"labelAngle\": -20, \"title\": \"Column\"}, \"field\": \"column_name\", \"sort\": {\"field\": \"bar_sort_order\", \"order\": \"ascending\"}, \"type\": \"nominal\"}, \"y\": {\"field\": \"sum_top\", \"type\": \"quantitative\"}}}, {\"mark\": {\"type\": \"text\", \"baseline\": \"bottom\", \"dy\": -5, \"fontSize\": 8}, \"encoding\": {\"color\": {\"value\": \"grey\"}, 
\"text\": {\"field\": \"value_r\", \"type\": \"nominal\"}, \"x\": {\"axis\": {\"labelAngle\": -20, \"title\": \"Column\"}, \"field\": \"column_name\", \"sort\": {\"field\": \"bar_sort_order\", \"order\": \"ascending\"}, \"type\": \"nominal\"}, \"y\": {\"field\": \"sum_top\", \"type\": \"quantitative\"}}}]}, {\"mark\": {\"type\": \"rule\", \"color\": \"black\", \"strokeWidth\": 2, \"x2Offset\": 30, \"xOffset\": -30}, \"encoding\": {\"x\": {\"axis\": {\"labelAngle\": -20, \"title\": \"Column\"}, \"field\": \"column_name\", \"sort\": {\"field\": \"bar_sort_order\", \"order\": \"ascending\"}, \"type\": \"nominal\"}, \"x2\": {\"field\": \"lead\"}, \"y\": {\"axis\": {\"labelExpr\": \"format(1 / (1 + pow(2, -1*datum.value)), '.2r')\", \"orient\": \"right\", \"title\": \"Probability\"}, \"field\": \"sum\", \"scale\": {\"zero\": false}, \"type\": \"quantitative\"}}}], \"data\": {\"name\": \"data-6135fb3ceb421e246e70bd834b48f595\"}, \"height\": 450, \"params\": [{\"name\": \"record_number\", \"bind\": {\"input\": \"range\", \"max\": 4, \"min\": 0, \"step\": 1}, \"value\": 0}], \"resolve\": {\"axis\": {\"y\": \"independent\"}}, \"title\": {\"text\": \"Match weights waterfall chart\", \"subtitle\": \"How each comparison contributes to the final match score\"}, \"transform\": [{\"filter\": \"(datum.record_number == record_number)\"}, {\"window\": [{\"op\": \"sum\", \"field\": \"log2_bayes_factor\", \"as\": \"sum\"}, {\"op\": \"lead\", \"field\": \"column_name\", \"as\": \"lead\"}], \"frame\": [null, 0]}, {\"calculate\": \"datum.column_name === \\\"Final score\\\" ? datum.sum - datum.log2_bayes_factor : datum.sum\", \"as\": \"sum\"}, {\"calculate\": \"datum.lead === null ? datum.column_name : datum.lead\", \"as\": \"lead\"}, {\"calculate\": \"datum.column_name === \\\"Final score\\\" || datum.column_name === \\\"Prior match weight\\\" ? 0 : datum.sum - datum.log2_bayes_factor\", \"as\": \"previous_sum\"}, {\"calculate\": \"datum.sum > datum.previous_sum ? 
datum.column_name : \\\"\\\"\", \"as\": \"top_label\"}, {\"calculate\": \"datum.sum < datum.previous_sum ? datum.column_name : \\\"\\\"\", \"as\": \"bottom_label\"}, {\"calculate\": \"datum.sum > datum.previous_sum ? datum.sum : datum.previous_sum\", \"as\": \"sum_top\"}, {\"calculate\": \"datum.sum < datum.previous_sum ? datum.sum : datum.previous_sum\", \"as\": \"sum_bottom\"}, {\"calculate\": \"(datum.sum + datum.previous_sum) / 2\", \"as\": \"center\"}, {\"calculate\": \"(datum.log2_bayes_factor > 0 ? \\\"+\\\" : \\\"\\\") + datum.log2_bayes_factor\", \"as\": \"text_log2_bayes_factor\"}, {\"calculate\": \"datum.sum < datum.previous_sum ? 4 : -4\", \"as\": \"dy\"}, {\"calculate\": \"datum.sum < datum.previous_sum ? \\\"top\\\" : \\\"bottom\\\"\", \"as\": \"baseline\"}, {\"calculate\": \"1. / (1 + pow(2, -1.*datum.sum))\", \"as\": \"prob\"}, {\"calculate\": \"0*datum.sum\", \"as\": \"zero\"}], \"width\": {\"step\": 75}, \"$schema\": \"https://vega.github.io/schema/vega-lite/v5.9.3.json\", \"datasets\": {\"data-6135fb3ceb421e246e70bd834b48f595\": [{\"column_name\": \"Prior\", \"label_for_charts\": \"Starting match weight (prior)\", \"sql_condition\": null, \"log2_bayes_factor\": -12.845746707461347, \"bayes_factor\": 0.00013584539607096294, \"comparison_vector_value\": null, \"m_probability\": null, \"u_probability\": null, \"bayes_factor_description\": null, \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": null, \"bar_sort_order\": 0, \"record_number\": 0}, {\"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.4493209267752062, \"u_probability\": 0.9870986175565454, \"bayes_factor\": 0.4551935528867936, \"log2_bayes_factor\": -1.1354479706438325, \"comparison_vector_value\": 0, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  2.20 times less likely to be a match\", \"column_name\": \"first_name\", \"value_l\": \"thomas\", \"value_r\": \"rhomas\", 
\"term_frequency_adjustment\": false, \"bar_sort_order\": 1, \"record_number\": 0}, {\"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.4493209267752062, \"u_probability\": 0.9870986175565454, \"bayes_factor\": 1.0, \"log2_bayes_factor\": 0.0, \"comparison_vector_value\": 0, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  2.20 times less likely to be a match\", \"column_name\": \"tf_first_name\", \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": true, \"bar_sort_order\": 2, \"record_number\": 0}, {\"sql_condition\": \"\\\"surname_l\\\" = \\\"surname_r\\\"\", \"label_for_charts\": \"Exact match on surname\", \"m_probability\": 0.7798213256103345, \"u_probability\": 0.0007567808197398964, \"bayes_factor\": 1030.4454146688827, \"log2_bayes_factor\": 10.00905236831435, \"comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `exact match on surname` then comparison is 1,030.45 times more likely to be a match\", \"column_name\": \"surname\", \"value_l\": \"chudleigh\", \"value_r\": \"chudleigh\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 3, \"record_number\": 0}, {\"sql_condition\": \"\\\"dob_l\\\" = \\\"dob_r\\\"\", \"label_for_charts\": \"Exact match on dob\", \"m_probability\": 0.6198138543741121, \"u_probability\": 0.0019758263205186424, \"bayes_factor\": 313.6985513035451, \"log2_bayes_factor\": 8.293235056437856, \"comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `exact match on dob` then comparison is 313.70 times more likely to be a match\", \"column_name\": \"dob\", \"value_l\": \"1630-08-01\", \"value_r\": \"1630-08-01\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 4, \"record_number\": 0}, {\"sql_condition\": \"levenshtein_distance(\\\"postcode_fake_l\\\", \\\"postcode_fake_r\\\") <= 1\", \"label_for_charts\": \"Levenshtein distance of postcode_fake 
<= 1\", \"m_probability\": 0.08663919768424605, \"u_probability\": 7.900978825044774e-05, \"bayes_factor\": 1096.5628386398703, \"log2_bayes_factor\": 10.098772772821805, \"comparison_vector_value\": 2, \"bayes_factor_description\": \"If comparison level is `levenshtein distance of postcode_fake <= 1` then comparison is 1,096.56 times more likely to be a match\", \"column_name\": \"postcode_fake\", \"value_l\": \"tq13 8df\", \"value_r\": \"tq13 8dg\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 5, \"record_number\": 0}, {\"sql_condition\": \"\\\"birth_place_l\\\" = \\\"birth_place_r\\\"\", \"label_for_charts\": \"Exact match on birth_place\", \"m_probability\": 0.8462634396206725, \"u_probability\": 0.005265650721787716, \"bayes_factor\": 160.7139334401887, \"log2_bayes_factor\": 7.328351201760086, \"comparison_vector_value\": 1, \"bayes_factor_description\": \"If comparison level is `exact match on birth_place` then comparison is 160.71 times more likely to be a match\", \"column_name\": \"birth_place\", \"value_l\": \"devon\", \"value_r\": \"devon\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 6, \"record_number\": 0}, {\"sql_condition\": \"\\\"birth_place_l\\\" = \\\"birth_place_r\\\"\", \"label_for_charts\": \"Term freq adjustment on birth_place with weight {cl.tf_adjustment_weight}\", \"m_probability\": null, \"u_probability\": null, \"bayes_factor\": 4.179107630122829, \"log2_bayes_factor\": 2.06319491478497, \"comparison_vector_value\": 1, \"bayes_factor_description\": \"Term frequency adjustment on birth_place makes comparison 4.18 times more likely to be a match\", \"column_name\": \"tf_birth_place\", \"value_l\": \"devon\", \"value_r\": \"devon\", \"term_frequency_adjustment\": true, \"bar_sort_order\": 7, \"record_number\": 0}, {\"sql_condition\": \"\\\"occupation_l\\\" = \\\"occupation_r\\\"\", \"label_for_charts\": \"Exact match on occupation\", \"m_probability\": 0.8993324609099845, \"u_probability\": 0.03924326946398254, 
\"bayes_factor\": 22.91685869179151, \"log2_bayes_factor\": 4.518337396383288, \"comparison_vector_value\": 1, \"bayes_factor_description\": \"If comparison level is `exact match on occupation` then comparison is 22.92 times more likely to be a match\", \"column_name\": \"occupation\", \"value_l\": \"politician\", \"value_r\": \"politician\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 8, \"record_number\": 0}, {\"sql_condition\": \"\\\"occupation_l\\\" = \\\"occupation_r\\\"\", \"label_for_charts\": \"Term freq adjustment on occupation with weight {cl.tf_adjustment_weight}\", \"m_probability\": null, \"u_probability\": null, \"bayes_factor\": 0.44127302866814333, \"log2_bayes_factor\": -1.1802565247677892, \"comparison_vector_value\": 1, \"bayes_factor_description\": \"Term frequency adjustment on occupation makes comparison  2.27 times less likely to be a match\", \"column_name\": \"tf_occupation\", \"value_l\": \"politician\", \"value_r\": \"politician\", \"term_frequency_adjustment\": true, \"bar_sort_order\": 9, \"record_number\": 0}, {\"column_name\": \"Final score\", \"label_for_charts\": \"Final score\", \"sql_condition\": null, \"log2_bayes_factor\": 27.149492507629386, \"bayes_factor\": 148871516.31468534, \"comparison_vector_value\": null, \"m_probability\": null, \"u_probability\": null, \"bayes_factor_description\": null, \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": null, \"bar_sort_order\": 10, \"record_number\": 0}, {\"column_name\": \"Prior\", \"label_for_charts\": \"Starting match weight (prior)\", \"sql_condition\": null, \"log2_bayes_factor\": -12.845746707461347, \"bayes_factor\": 0.00013584539607096294, \"comparison_vector_value\": null, \"m_probability\": null, \"u_probability\": null, \"bayes_factor_description\": null, \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": null, \"bar_sort_order\": 0, \"record_number\": 1}, {\"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other 
comparisons\", \"m_probability\": 0.4493209267752062, \"u_probability\": 0.9870986175565454, \"bayes_factor\": 0.4551935528867936, \"log2_bayes_factor\": -1.1354479706438325, \"comparison_vector_value\": 0, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  2.20 times less likely to be a match\", \"column_name\": \"first_name\", \"value_l\": \"thomas\", \"value_r\": \"clifford,\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 1, \"record_number\": 1}, {\"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.4493209267752062, \"u_probability\": 0.9870986175565454, \"bayes_factor\": 1.0, \"log2_bayes_factor\": 0.0, \"comparison_vector_value\": 0, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  2.20 times less likely to be a match\", \"column_name\": \"tf_first_name\", \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": true, \"bar_sort_order\": 2, \"record_number\": 1}, {\"sql_condition\": \"\\\"surname_l\\\" = \\\"surname_r\\\"\", \"label_for_charts\": \"Exact match on surname\", \"m_probability\": 0.7798213256103345, \"u_probability\": 0.0007567808197398964, \"bayes_factor\": 1030.4454146688827, \"log2_bayes_factor\": 10.00905236831435, \"comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `exact match on surname` then comparison is 1,030.45 times more likely to be a match\", \"column_name\": \"surname\", \"value_l\": \"chudleigh\", \"value_r\": \"chudleigh\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 3, \"record_number\": 1}, {\"sql_condition\": \"\\\"dob_l\\\" = \\\"dob_r\\\"\", \"label_for_charts\": \"Exact match on dob\", \"m_probability\": 0.6198138543741121, \"u_probability\": 0.0019758263205186424, \"bayes_factor\": 313.6985513035451, \"log2_bayes_factor\": 8.293235056437856, \"comparison_vector_value\": 3, \"bayes_factor_description\": \"If 
comparison level is `exact match on dob` then comparison is 313.70 times more likely to be a match\", \"column_name\": \"dob\", \"value_l\": \"1630-08-01\", \"value_r\": \"1630-08-01\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 4, \"record_number\": 1}, {\"sql_condition\": \"\\\"postcode_fake_l\\\" IS NULL OR \\\"postcode_fake_r\\\" IS NULL\", \"label_for_charts\": \"postcode_fake is NULL\", \"bayes_factor\": 1.0, \"log2_bayes_factor\": 0.0, \"comparison_vector_value\": -1, \"bayes_factor_description\": \"If comparison level is `postcode_fake is null` then comparison is 1.00 times more likely to be a match\", \"column_name\": \"postcode_fake\", \"value_l\": \"tq13 8df\", \"value_r\": \"None\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 5, \"record_number\": 1}, {\"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.15373656037932748, \"u_probability\": 0.9947343492782122, \"bayes_factor\": 0.15455036863950666, \"log2_bayes_factor\": -2.693850999515206, \"comparison_vector_value\": 0, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  6.47 times less likely to be a match\", \"column_name\": \"birth_place\", \"value_l\": \"devon\", \"value_r\": \"west devon\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 6, \"record_number\": 1}, {\"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.15373656037932748, \"u_probability\": 0.9947343492782122, \"bayes_factor\": 1.0, \"log2_bayes_factor\": 0.0, \"comparison_vector_value\": 0, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  6.47 times less likely to be a match\", \"column_name\": \"tf_birth_place\", \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": true, \"bar_sort_order\": 7, \"record_number\": 1}, {\"sql_condition\": \"\\\"occupation_l\\\" IS NULL OR \\\"occupation_r\\\" 
IS NULL\", \"label_for_charts\": \"occupation is NULL\", \"bayes_factor\": 1.0, \"log2_bayes_factor\": 0.0, \"comparison_vector_value\": -1, \"bayes_factor_description\": \"If comparison level is `occupation is null` then comparison is 1.00 times more likely to be a match\", \"column_name\": \"occupation\", \"value_l\": \"politician\", \"value_r\": \"None\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 8, \"record_number\": 1}, {\"sql_condition\": \"\\\"occupation_l\\\" IS NULL OR \\\"occupation_r\\\" IS NULL\", \"label_for_charts\": \"occupation is NULL\", \"bayes_factor\": 1.0, \"log2_bayes_factor\": 0.0, \"comparison_vector_value\": -1, \"bayes_factor_description\": \"If comparison level is `occupation is null` then comparison is 1.00 times more likely to be a match\", \"column_name\": \"tf_occupation\", \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": true, \"bar_sort_order\": 9, \"record_number\": 1}, {\"column_name\": \"Final score\", \"label_for_charts\": \"Final score\", \"sql_condition\": null, \"log2_bayes_factor\": 1.6272417471318208, \"bayes_factor\": 3.089218137984875, \"comparison_vector_value\": null, \"m_probability\": null, \"u_probability\": null, \"bayes_factor_description\": null, \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": null, \"bar_sort_order\": 10, \"record_number\": 1}, {\"column_name\": \"Prior\", \"label_for_charts\": \"Starting match weight (prior)\", \"sql_condition\": null, \"log2_bayes_factor\": -12.845746707461347, \"bayes_factor\": 0.00013584539607096294, \"comparison_vector_value\": null, \"m_probability\": null, \"u_probability\": null, \"bayes_factor_description\": null, \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": null, \"bar_sort_order\": 0, \"record_number\": 2}, {\"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.4493209267752062, \"u_probability\": 0.9870986175565454, \"bayes_factor\": 
0.4551935528867936, \"log2_bayes_factor\": -1.1354479706438325, \"comparison_vector_value\": 0, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  2.20 times less likely to be a match\", \"column_name\": \"first_name\", \"value_l\": \"thomas\", \"value_r\": \"tom\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 1, \"record_number\": 2}, {\"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.4493209267752062, \"u_probability\": 0.9870986175565454, \"bayes_factor\": 1.0, \"log2_bayes_factor\": 0.0, \"comparison_vector_value\": 0, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  2.20 times less likely to be a match\", \"column_name\": \"tf_first_name\", \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": true, \"bar_sort_order\": 2, \"record_number\": 2}, {\"sql_condition\": \"\\\"surname_l\\\" = \\\"surname_r\\\"\", \"label_for_charts\": \"Exact match on surname\", \"m_probability\": 0.7798213256103345, \"u_probability\": 0.0007567808197398964, \"bayes_factor\": 1030.4454146688827, \"log2_bayes_factor\": 10.00905236831435, \"comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `exact match on surname` then comparison is 1,030.45 times more likely to be a match\", \"column_name\": \"surname\", \"value_l\": \"chudleigh\", \"value_r\": \"chudleigh\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 3, \"record_number\": 2}, {\"sql_condition\": \"\\\"dob_l\\\" = \\\"dob_r\\\"\", \"label_for_charts\": \"Exact match on dob\", \"m_probability\": 0.6198138543741121, \"u_probability\": 0.0019758263205186424, \"bayes_factor\": 313.6985513035451, \"log2_bayes_factor\": 8.293235056437856, \"comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `exact match on dob` then comparison is 313.70 times more likely to be a match\", 
\"column_name\": \"dob\", \"value_l\": \"1630-08-01\", \"value_r\": \"1630-08-01\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 4, \"record_number\": 2}, {\"sql_condition\": \"\\\"postcode_fake_l\\\" = \\\"postcode_fake_r\\\"\", \"label_for_charts\": \"Exact match on postcode_fake\", \"m_probability\": 0.687726663068843, \"u_probability\": 0.00015071615069623225, \"bayes_factor\": 4563.058835379581, \"log2_bayes_factor\": 12.155785540453788, \"comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `exact match on postcode_fake` then comparison is 4,563.06 times more likely to be a match\", \"column_name\": \"postcode_fake\", \"value_l\": \"tq13 8df\", \"value_r\": \"tq13 8df\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 5, \"record_number\": 2}, {\"sql_condition\": \"\\\"birth_place_l\\\" = \\\"birth_place_r\\\"\", \"label_for_charts\": \"Exact match on birth_place\", \"m_probability\": 0.8462634396206725, \"u_probability\": 0.005265650721787716, \"bayes_factor\": 160.7139334401887, \"log2_bayes_factor\": 7.328351201760086, \"comparison_vector_value\": 1, \"bayes_factor_description\": \"If comparison level is `exact match on birth_place` then comparison is 160.71 times more likely to be a match\", \"column_name\": \"birth_place\", \"value_l\": \"devon\", \"value_r\": \"devon\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 6, \"record_number\": 2}, {\"sql_condition\": \"\\\"birth_place_l\\\" = \\\"birth_place_r\\\"\", \"label_for_charts\": \"Term freq adjustment on birth_place with weight {cl.tf_adjustment_weight}\", \"m_probability\": null, \"u_probability\": null, \"bayes_factor\": 4.179107630122829, \"log2_bayes_factor\": 2.06319491478497, \"comparison_vector_value\": 1, \"bayes_factor_description\": \"Term frequency adjustment on birth_place makes comparison 4.18 times more likely to be a match\", \"column_name\": \"tf_birth_place\", \"value_l\": \"devon\", \"value_r\": \"devon\", 
\"term_frequency_adjustment\": true, \"bar_sort_order\": 7, \"record_number\": 2}, {\"sql_condition\": \"\\\"occupation_l\\\" = \\\"occupation_r\\\"\", \"label_for_charts\": \"Exact match on occupation\", \"m_probability\": 0.8993324609099845, \"u_probability\": 0.03924326946398254, \"bayes_factor\": 22.91685869179151, \"log2_bayes_factor\": 4.518337396383288, \"comparison_vector_value\": 1, \"bayes_factor_description\": \"If comparison level is `exact match on occupation` then comparison is 22.92 times more likely to be a match\", \"column_name\": \"occupation\", \"value_l\": \"politician\", \"value_r\": \"politician\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 8, \"record_number\": 2}, {\"sql_condition\": \"\\\"occupation_l\\\" = \\\"occupation_r\\\"\", \"label_for_charts\": \"Term freq adjustment on occupation with weight {cl.tf_adjustment_weight}\", \"m_probability\": null, \"u_probability\": null, \"bayes_factor\": 0.44127302866814333, \"log2_bayes_factor\": -1.1802565247677892, \"comparison_vector_value\": 1, \"bayes_factor_description\": \"Term frequency adjustment on occupation makes comparison  2.27 times less likely to be a match\", \"column_name\": \"tf_occupation\", \"value_l\": \"politician\", \"value_r\": \"politician\", \"term_frequency_adjustment\": true, \"bar_sort_order\": 9, \"record_number\": 2}, {\"column_name\": \"Final score\", \"label_for_charts\": \"Final score\", \"sql_condition\": null, \"log2_bayes_factor\": 29.20650527526137, \"bayes_factor\": 619489794.7651293, \"comparison_vector_value\": null, \"m_probability\": null, \"u_probability\": null, \"bayes_factor_description\": null, \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": null, \"bar_sort_order\": 10, \"record_number\": 2}, {\"column_name\": \"Prior\", \"label_for_charts\": \"Starting match weight (prior)\", \"sql_condition\": null, \"log2_bayes_factor\": -12.845746707461347, \"bayes_factor\": 0.00013584539607096294, 
\"comparison_vector_value\": null, \"m_probability\": null, \"u_probability\": null, \"bayes_factor_description\": null, \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": null, \"bar_sort_order\": 0, \"record_number\": 3}, {\"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.4493209267752062, \"u_probability\": 0.9870986175565454, \"bayes_factor\": 0.4551935528867936, \"log2_bayes_factor\": -1.1354479706438325, \"comparison_vector_value\": 0, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  2.20 times less likely to be a match\", \"column_name\": \"first_name\", \"value_l\": \"thomas\", \"value_r\": \"tom\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 1, \"record_number\": 3}, {\"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.4493209267752062, \"u_probability\": 0.9870986175565454, \"bayes_factor\": 1.0, \"log2_bayes_factor\": 0.0, \"comparison_vector_value\": 0, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  2.20 times less likely to be a match\", \"column_name\": \"tf_first_name\", \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": true, \"bar_sort_order\": 2, \"record_number\": 3}, {\"sql_condition\": \"\\\"surname_l\\\" = \\\"surname_r\\\"\", \"label_for_charts\": \"Exact match on surname\", \"m_probability\": 0.7798213256103345, \"u_probability\": 0.0007567808197398964, \"bayes_factor\": 1030.4454146688827, \"log2_bayes_factor\": 10.00905236831435, \"comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `exact match on surname` then comparison is 1,030.45 times more likely to be a match\", \"column_name\": \"surname\", \"value_l\": \"chudleigh\", \"value_r\": \"chudleigh\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 3, \"record_number\": 3}, {\"sql_condition\": 
\"\\\"dob_l\\\" = \\\"dob_r\\\"\", \"label_for_charts\": \"Exact match on dob\", \"m_probability\": 0.6198138543741121, \"u_probability\": 0.0019758263205186424, \"bayes_factor\": 313.6985513035451, \"log2_bayes_factor\": 8.293235056437856, \"comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `exact match on dob` then comparison is 313.70 times more likely to be a match\", \"column_name\": \"dob\", \"value_l\": \"1630-08-01\", \"value_r\": \"1630-08-01\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 4, \"record_number\": 3}, {\"sql_condition\": \"\\\"postcode_fake_l\\\" = \\\"postcode_fake_r\\\"\", \"label_for_charts\": \"Exact match on postcode_fake\", \"m_probability\": 0.687726663068843, \"u_probability\": 0.00015071615069623225, \"bayes_factor\": 4563.058835379581, \"log2_bayes_factor\": 12.155785540453788, \"comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `exact match on postcode_fake` then comparison is 4,563.06 times more likely to be a match\", \"column_name\": \"postcode_fake\", \"value_l\": \"tq13 8df\", \"value_r\": \"tq13 8df\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 5, \"record_number\": 3}, {\"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.15373656037932748, \"u_probability\": 0.9947343492782122, \"bayes_factor\": 0.15455036863950666, \"log2_bayes_factor\": -2.693850999515206, \"comparison_vector_value\": 0, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  6.47 times less likely to be a match\", \"column_name\": \"birth_place\", \"value_l\": \"devon\", \"value_r\": \"west devon\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 6, \"record_number\": 3}, {\"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.15373656037932748, \"u_probability\": 0.9947343492782122, \"bayes_factor\": 1.0, 
\"log2_bayes_factor\": 0.0, \"comparison_vector_value\": 0, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  6.47 times less likely to be a match\", \"column_name\": \"tf_birth_place\", \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": true, \"bar_sort_order\": 7, \"record_number\": 3}, {\"sql_condition\": \"\\\"occupation_l\\\" IS NULL OR \\\"occupation_r\\\" IS NULL\", \"label_for_charts\": \"occupation is NULL\", \"bayes_factor\": 1.0, \"log2_bayes_factor\": 0.0, \"comparison_vector_value\": -1, \"bayes_factor_description\": \"If comparison level is `occupation is null` then comparison is 1.00 times more likely to be a match\", \"column_name\": \"occupation\", \"value_l\": \"politician\", \"value_r\": \"None\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 8, \"record_number\": 3}, {\"sql_condition\": \"\\\"occupation_l\\\" IS NULL OR \\\"occupation_r\\\" IS NULL\", \"label_for_charts\": \"occupation is NULL\", \"bayes_factor\": 1.0, \"log2_bayes_factor\": 0.0, \"comparison_vector_value\": -1, \"bayes_factor_description\": \"If comparison level is `occupation is null` then comparison is 1.00 times more likely to be a match\", \"column_name\": \"tf_occupation\", \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": true, \"bar_sort_order\": 9, \"record_number\": 3}, {\"column_name\": \"Final score\", \"label_for_charts\": \"Final score\", \"sql_condition\": null, \"log2_bayes_factor\": 13.783027287585611, \"bayes_factor\": 14096.284118946756, \"comparison_vector_value\": null, \"m_probability\": null, \"u_probability\": null, \"bayes_factor_description\": null, \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": null, \"bar_sort_order\": 10, \"record_number\": 3}, {\"column_name\": \"Prior\", \"label_for_charts\": \"Starting match weight (prior)\", \"sql_condition\": null, \"log2_bayes_factor\": -12.845746707461347, \"bayes_factor\": 
0.00013584539607096294, \"comparison_vector_value\": null, \"m_probability\": null, \"u_probability\": null, \"bayes_factor_description\": null, \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": null, \"bar_sort_order\": 0, \"record_number\": 4}, {\"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.4493209267752062, \"u_probability\": 0.9870986175565454, \"bayes_factor\": 0.4551935528867936, \"log2_bayes_factor\": -1.1354479706438325, \"comparison_vector_value\": 0, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  2.20 times less likely to be a match\", \"column_name\": \"first_name\", \"value_l\": \"thomas\", \"value_r\": \"tom\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 1, \"record_number\": 4}, {\"sql_condition\": \"ELSE\", \"label_for_charts\": \"All other comparisons\", \"m_probability\": 0.4493209267752062, \"u_probability\": 0.9870986175565454, \"bayes_factor\": 1.0, \"log2_bayes_factor\": 0.0, \"comparison_vector_value\": 0, \"bayes_factor_description\": \"If comparison level is `all other comparisons` then comparison is  2.20 times less likely to be a match\", \"column_name\": \"tf_first_name\", \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": true, \"bar_sort_order\": 2, \"record_number\": 4}, {\"sql_condition\": \"\\\"surname_l\\\" = \\\"surname_r\\\"\", \"label_for_charts\": \"Exact match on surname\", \"m_probability\": 0.7798213256103345, \"u_probability\": 0.0007567808197398964, \"bayes_factor\": 1030.4454146688827, \"log2_bayes_factor\": 10.00905236831435, \"comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `exact match on surname` then comparison is 1,030.45 times more likely to be a match\", \"column_name\": \"surname\", \"value_l\": \"chudleigh\", \"value_r\": \"chudleigh\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 3, \"record_number\": 4}, 
{\"sql_condition\": \"\\\"dob_l\\\" = \\\"dob_r\\\"\", \"label_for_charts\": \"Exact match on dob\", \"m_probability\": 0.6198138543741121, \"u_probability\": 0.0019758263205186424, \"bayes_factor\": 313.6985513035451, \"log2_bayes_factor\": 8.293235056437856, \"comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `exact match on dob` then comparison is 313.70 times more likely to be a match\", \"column_name\": \"dob\", \"value_l\": \"1630-08-01\", \"value_r\": \"1630-08-01\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 4, \"record_number\": 4}, {\"sql_condition\": \"\\\"postcode_fake_l\\\" = \\\"postcode_fake_r\\\"\", \"label_for_charts\": \"Exact match on postcode_fake\", \"m_probability\": 0.687726663068843, \"u_probability\": 0.00015071615069623225, \"bayes_factor\": 4563.058835379581, \"log2_bayes_factor\": 12.155785540453788, \"comparison_vector_value\": 3, \"bayes_factor_description\": \"If comparison level is `exact match on postcode_fake` then comparison is 4,563.06 times more likely to be a match\", \"column_name\": \"postcode_fake\", \"value_l\": \"tq13 8df\", \"value_r\": \"tq13 8df\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 5, \"record_number\": 4}, {\"sql_condition\": \"\\\"birth_place_l\\\" = \\\"birth_place_r\\\"\", \"label_for_charts\": \"Exact match on birth_place\", \"m_probability\": 0.8462634396206725, \"u_probability\": 0.005265650721787716, \"bayes_factor\": 160.7139334401887, \"log2_bayes_factor\": 7.328351201760086, \"comparison_vector_value\": 1, \"bayes_factor_description\": \"If comparison level is `exact match on birth_place` then comparison is 160.71 times more likely to be a match\", \"column_name\": \"birth_place\", \"value_l\": \"devon\", \"value_r\": \"devon\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 6, \"record_number\": 4}, {\"sql_condition\": \"\\\"birth_place_l\\\" = \\\"birth_place_r\\\"\", \"label_for_charts\": \"Term freq adjustment on 
birth_place with weight {cl.tf_adjustment_weight}\", \"m_probability\": null, \"u_probability\": null, \"bayes_factor\": 4.179107630122829, \"log2_bayes_factor\": 2.06319491478497, \"comparison_vector_value\": 1, \"bayes_factor_description\": \"Term frequency adjustment on birth_place makes comparison 4.18 times more likely to be a match\", \"column_name\": \"tf_birth_place\", \"value_l\": \"devon\", \"value_r\": \"devon\", \"term_frequency_adjustment\": true, \"bar_sort_order\": 7, \"record_number\": 4}, {\"sql_condition\": \"\\\"occupation_l\\\" = \\\"occupation_r\\\"\", \"label_for_charts\": \"Exact match on occupation\", \"m_probability\": 0.8993324609099845, \"u_probability\": 0.03924326946398254, \"bayes_factor\": 22.91685869179151, \"log2_bayes_factor\": 4.518337396383288, \"comparison_vector_value\": 1, \"bayes_factor_description\": \"If comparison level is `exact match on occupation` then comparison is 22.92 times more likely to be a match\", \"column_name\": \"occupation\", \"value_l\": \"politician\", \"value_r\": \"politician\", \"term_frequency_adjustment\": false, \"bar_sort_order\": 8, \"record_number\": 4}, {\"sql_condition\": \"\\\"occupation_l\\\" = \\\"occupation_r\\\"\", \"label_for_charts\": \"Term freq adjustment on occupation with weight {cl.tf_adjustment_weight}\", \"m_probability\": null, \"u_probability\": null, \"bayes_factor\": 0.44127302866814333, \"log2_bayes_factor\": -1.1802565247677892, \"comparison_vector_value\": 1, \"bayes_factor_description\": \"Term frequency adjustment on occupation makes comparison  2.27 times less likely to be a match\", \"column_name\": \"tf_occupation\", \"value_l\": \"politician\", \"value_r\": \"politician\", \"term_frequency_adjustment\": true, \"bar_sort_order\": 9, \"record_number\": 4}, {\"column_name\": \"Final score\", \"label_for_charts\": \"Final score\", \"sql_condition\": null, \"log2_bayes_factor\": 29.20650527526137, \"bayes_factor\": 619489794.7651293, \"comparison_vector_value\": null, 
\"m_probability\": null, \"u_probability\": null, \"bayes_factor_description\": null, \"value_l\": \"\", \"value_r\": \"\", \"term_frequency_adjustment\": null, \"bar_sort_order\": 10, \"record_number\": 4}]}}, {\"mode\": \"vega-lite\"});\n",
              "</script>"
            ],
            "text/plain": [
              "alt.LayerChart(...)"
            ]
          },
          "execution_count": 17,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
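        "# Convert the scored comparisons to a list of dicts, one per record pair\n",
        "records_to_plot = df_e.to_dict(orient=\"records\")\n",
        "# Show how each comparison's match weight contributes to the final score\n",
        "linker.visualisations.waterfall_chart(records_to_plot, filter_nulls=False)"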
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 18,
      "id": "4c8f021b-49e7-4f9e-ad32-72066084470d",
      "metadata": {},
      "outputs": [
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "Completed iteration 1, root rows count 641\n",
            "Completed iteration 2, root rows count 187\n",
            "Completed iteration 3, root rows count 251\n",
            "Completed iteration 4, root rows count 75\n",
            "Completed iteration 5, root rows count 23\n",
            "Completed iteration 6, root rows count 30\n",
            "Completed iteration 7, root rows count 34\n",
            "Completed iteration 8, root rows count 30\n",
            "Completed iteration 9, root rows count 9\n",
            "Completed iteration 10, root rows count 5\n",
            "Completed iteration 11, root rows count 0\n"
          ]
        }
      ],
      "source": [
        "# Cluster records, treating pairs with match probability >= 0.95 as links\n",
        "clusters = linker.clustering.cluster_pairwise_predictions_at_threshold(\n",
        "    df_predict, threshold_match_probability=0.95\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "id": "8b036e11-e15c-4196-a268-faf62c2ec85a",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "\n",
              "        <iframe\n",
              "            width=\"100%\"\n",
              "            height=\"1200\"\n",
              "            src=\"./dashboards/50k_cluster.html\"\n",
              "            frameborder=\"0\"\n",
              "            allowfullscreen\n",
              "            \n",
              "        ></iframe>\n",
              "        "
            ],
            "text/plain": [
              "<IPython.lib.display.IFrame at 0x111148ac0>"
            ]
          },
          "execution_count": 2,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "from IPython.display import IFrame\n",
        "\n",
        "# Write an interactive cluster dashboard to an HTML file\n",
        "linker.visualisations.cluster_studio_dashboard(\n",
        "    df_predict,\n",
        "    clusters,\n",
        "    \"dashboards/50k_cluster.html\",\n",
        "    sampling_method=\"by_cluster_size\",\n",
        "    overwrite=True,\n",
        ")\n",
        "\n",
        "# Embed the saved dashboard in the notebook\n",
        "IFrame(src=\"./dashboards/50k_cluster.html\", width=\"100%\", height=1200)"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "venv",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.10.8"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 5
}
